Lecture 14
Duke University
STA 199 - Fall 2025
October 9, 2025
The following code in chronicle-scrape.R extracts the titles of opinion articles from The Chronicle website:
Which of the following needs to change to extract column titles instead?
read_html()
html_elements() to html_element()
.space-y-4 .font-extrabold to .space-y-4 .text-brand
html_text() to html_attr()
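To see what each of these changes does, here is a minimal sketch on a toy page. The class names are the ones from the options above, but the HTML itself is hypothetical and much simpler than the real Chronicle page, so the selectors may behave differently there.

```r
library(rvest)

# Toy page: class names come from the slide, the structure is an assumption
page <- minimal_html('
  <div class="space-y-4">
    <a class="text-brand" href="/section/opinion">Opinion</a>
    <a class="font-extrabold" href="/article/a">First title</a>
  </div>
  <div class="space-y-4">
    <a class="text-brand" href="/section/campus-voices">Campus Voices</a>
    <a class="font-extrabold" href="/article/b">Second title</a>
  </div>
')

# html_elements() returns every match; html_text() extracts the text
titles <- page |>
  html_elements(".space-y-4 .font-extrabold") |>
  html_text()

# A different selector targets different elements on the same page
columns <- page |>
  html_elements(".space-y-4 .text-brand") |>
  html_text()

# html_element() (singular) returns only the first match
first_title <- page |>
  html_element(".space-y-4 .font-extrabold") |>
  html_text()

# html_attr() extracts an attribute value instead of the text
links <- page |>
  html_elements(".space-y-4 .font-extrabold") |>
  html_attr("href")
```

Comparing titles, columns, first_title, and links shows which part of the pipeline controls which elements are selected and which part controls what is pulled out of them.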
Scan the QR code or go to app.wooclap.com/sta199. Log in with your Duke NetID.
HW 2, Question 7: Reproduce the colorful box plot – We caught an error in grading (any theme with a white background would have worked). If you originally missed points due to not using theme_bw(), but you used another theme with a white background, we’ve updated your grade.
Midsemester course survey due tonight at 11:59pm
Project proposals (Milestone 2) + first peer evaluation due next Thursday at 11:59pm – any questions?
Go to https://www2.stat.duke.edu/~cr173/data/dukechronicle-opinion/www.dukechronicle.com/section/opinionabc4.html (copy of The Chronicle opinion section as of October 7, 2025).
Go to your ae project in RStudio.
If you haven’t yet done so, make sure all of your changes up to this point are committed and pushed, i.e., there’s nothing left in your Git pane.
If you haven’t yet done so, click Pull to get today’s application exercise files: ae-09-chronicle-scrape.qmd and chronicle-scrape.R.
Put the following tasks in order to scrape data from a website:
read_html() to read the page’s source code into R
When working in a Quarto document, your analysis is re-run each time you knit.
If web scraping in a Quarto document, you’d be re-scraping the data each time you knit, which is undesirable (and not nice)!
An alternative workflow:
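One possible version of such a workflow, sketched under the assumption that the scraping lives in a separate R script that saves its result to a file, and the Quarto document only reads that file. The toy data and the temporary file path are placeholders, not the course's actual files.

```r
library(readr)
library(tibble)

# --- In the scraping script (e.g., chronicle-scrape.R) ---
# Stand-in for the scraped data; the real script would build this with rvest
chronicle <- tibble(
  title = c("First title", "Second title"),
  column = c("Opinion", "Campus Voices")
)

# Save the result once. A temporary file is used here so the sketch runs
# anywhere; a project would more likely use a path inside a data folder.
csv_path <- tempfile(fileext = ".csv")
write_csv(chronicle, csv_path)

# --- In the Quarto document ---
# Read the saved file instead of scraping again on every render
chronicle_saved <- read_csv(csv_path, show_col_types = FALSE)
```

With this split, the website is contacted only when the script is run on purpose, not every time the document is rendered.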
What additional information do we need to produce the table below?
Column | Avg. # words/article | # articles |
---|---|---|
Campus Voices | 942 | 382 |
Letters To The Editor | 307 | 19 |
Opinion | 1020 | 99 |
.article-content
{xml_nodeset (1)}
[1] <article class="full-article arx-styles prose max-w-3xl lg:prose-lg text-gray-900 bs-shim article-content" id="article-content-d976e3a0-027c-447d-a162-bd10f5d921a3"><p spellcheck="false" aria-label="To enrich screen reader interactions, please activate Accessibility in Grammarly extension settings">Fraternities thrive on exclusivity. At Duke, that posture may be tolerable, even expected. But once you’ve seen how Greek life operates (gatekeeping at the door, bending rules for insiders, pun ...
[1] "Fraternities thrive on exclusivity. At Duke, that posture may be tolerable, even expected. But once you’ve seen how Greek life operates (gatekeeping at the door, bending rules for insiders, punishing outsiders) it’s hard not to notice the same logic playing out on a national scale. The United States, too, runs like a frat: immigration represents the rush process, the Constitution and the insiders who decide when those rules apply and when they don’t.\n\nThe First Test: Do You Belong?\n\nImmigration is America’s bid night. Who gets past the door and who gets turned away, reveals who the frat thinks deserves to wear its letters. \n\nIn the late 1800s, the Chinese Exclusion Act was the ultimate “you don’t belong here.” It was the equivalent of a fraternity cutting pledges at the door, not because of merit, but because of appearance.\n\nHistorian Andrew Gyory argues in “Closing the Gate: Race, Politics, and the Chinese Exclusion Act,” that the 1882 Chinese Exclusion law wasn’t just about jobs or economics — it was about racial scapegoating weaponized in a crisis. In the recession of the 1870s, white workers blamed Chinese laborers for stealing jobs and depressing wages. Politicians ran with it, branding Chinese immigrants as “coolies” (a slur) who threatened white labor. This act suspended immigration, denied naturalization and subjected exempt merchants and students to humiliating interrogation. \n\nThe same script is running today. Just as Chinese laborers were scapegoated during the 1870s downturn, today’s immigrants are painted as threats to American workers. Trump says migrants “steal jobs” and “drive down wages,” while ignoring the actual structural causes of inequality. Wealth-based hurdles for H-1B visas, restrictions on asylum work permits: all built on the same assumption that keeping immigrants out will magically protect American jobs. 
The target changes, but the logic does not: protect the insiders, keep the outsiders desperate.\n\nIt’s not “No Asians allowed” anymore; it’s “You can only enter if you can pay the toll.” The words have shifted — “merit,” “abuse,” “national security” — but the message hasn’t. Some people belong. Others don’t. \n\nGatekeeping at the border bleeds into gatekeeping the Constitution itself. Once you’re in though, it may not be what you expect it to. \n\nInside the House\n\nOnce you’re in, the house makes the rules. Fraternities are notorious for setting the tone of how members act, who they associate with and even how they speak. The U.S. works the same way: once you’re past the border, culture and politics dictate what’s acceptable and what’s excused.\n\nTrump’s infamous Access Hollywood tape was brushed off as “locker room talk.” Anyone who’s been in a frat knows exactly what that means: a culture where misogyny isn’t condemned; it’s normalized as tradition. And Trump isn’t an outlier. Defense Secretary Pete Hegseth recently scolded generals for being “fat” in the halls of the Pentagon. He said, “It's completely unacceptable to see fat generals and admirals in the halls of the Pentagon and leading commands around the country and the world.”\n\nWhat starts as frat culture seeps into governance. The overlap isn’t just metaphorical anymore. It’s literal.\n\nThe Second Test: Will You Submit?\n\nEvery frat has rules, but everyone knows they bend when it suits those in charge. National bylaws formally ban hazing and restrict alcohol, yet incidents from Penn State and Bowling Green prove those rules are selectively enforced. They are college kids, not U.S. government officials. So, no. No leeway here. \n\nThe Constitution is supposed to be America’s house bylaws, but we’re watching them get bent the same way fraternity brothers bend their codes of conduct. 
\n\nThe Fifth Amendment says: “No person shall … be deprived of life, liberty, or property, without due process of law.” No person. Not no citizen. Yet under Trump’s use of the Alien Enemies Act, Venezuelan migrants have been deported straight into El Salvador’s prisons — without hearings, charges or convictions. That’s not due process. That’s rule by decree.\n\nThe First Amendment is also under attack. Student activists like Rümeysa Öztürk have been detained by ICE for nothing more than co-writing an article. Visas revoked, scholars deported — not because of crime, but because of speech. This amendment also doesn’t specify “citizens,” it specifies “the people.”\n\nThe Fourteenth Amendment’s Equal Protection Clause is hanging by a thread. Trump’s executive order to end birthright citizenship — blocked in court but revealing in intent — openly defies the guarantee that “All persons born or naturalized in the United States… are citizens.” If the president of the United States feels free to ignore the Constitution’s plain text, what comes next?\n\nThe Third Test: Will You Stay Silent? \n\nI don’t deny a country, or a club, has the right to control its borders or rules. Sovereignty matters. Rules are necessary. But we have slipped into what I call selective constitutionalism: rules enforced for some (Second Amendment) and not others (Fifth Amendment). \n\nViktor Orbán’s Hungary is the case study: elections still happen, but rules are bent until liberal democracy is hollowed out and only the shell remains. \n\nA common rebuttal is: “Trump just says things, he doesn’t mean them.” That’s a lie we tell ourselves to stay comfortable. When I say something outrageous, it’s just words. When a president says the same thing, it becomes marching orders. Presidential words are not idle. They are policy.\n\nEven when I disagree with a law — like the Second Amendment — I still believe in respecting due process. 
If Biden abolished gun rights tomorrow by executive order, I would oppose it. Because the process matters. Without it, the rule of law is gone.\n\nI use this metaphor because it shouldn’t be foreign to any of us: if you’ve witnessed the selectivism that exists in Greek life at Duke, or seen rules bent to protect a brother, you’ve seen the same logic, when scaled up, applied to our nation's Constitution. What’s tolerable as a flawed college club becomes catastrophic when it defines a nation.\n\nWhen rules are bent long enough, they don’t snap. They disappear. In a frat, that leaves chaos. In a country, it leaves tyranny.\n\nNoor Nazir is a Trinity junior. Her columns typically run on alternate Tuesdays. "
[1] "Fraternities thrive on exclusivity. At Duke, that posture may be tolerable, even expected. But once you’ve seen how Greek life operates (gatekeeping at the door, bending rules for insiders, punishing outsiders) it’s hard not to notice the same logic playing out on a national scale. The United States, too, runs like a frat: immigration represents the rush process, the Constitution and the insiders who decide when those rules apply and when they don’t.The First Test: Do You Belong?Immigration is America’s bid night. Who gets past the door and who gets turned away, reveals who the frat thinks deserves to wear its letters. In the late 1800s, the Chinese Exclusion Act was the ultimate “you don’t belong here.” It was the equivalent of a fraternity cutting pledges at the door, not because of merit, but because of appearance.Historian Andrew Gyory argues in “Closing the Gate: Race, Politics, and the Chinese Exclusion Act,” that the 1882 Chinese Exclusion law wasn’t just about jobs or economics — it was about racial scapegoating weaponized in a crisis. In the recession of the 1870s, white workers blamed Chinese laborers for stealing jobs and depressing wages. Politicians ran with it, branding Chinese immigrants as “coolies” (a slur) who threatened white labor. This act suspended immigration, denied naturalization and subjected exempt merchants and students to humiliating interrogation. The same script is running today. Just as Chinese laborers were scapegoated during the 1870s downturn, today’s immigrants are painted as threats to American workers. Trump says migrants “steal jobs” and “drive down wages,” while ignoring the actual structural causes of inequality. Wealth-based hurdles for H-1B visas, restrictions on asylum work permits: all built on the same assumption that keeping immigrants out will magically protect American jobs. 
The target changes, but the logic does not: protect the insiders, keep the outsiders desperate.It’s not “No Asians allowed” anymore; it’s “You can only enter if you can pay the toll.” The words have shifted — “merit,” “abuse,” “national security” — but the message hasn’t. Some people belong. Others don’t. Gatekeeping at the border bleeds into gatekeeping the Constitution itself. Once you’re in though, it may not be what you expect it to. Inside the HouseOnce you’re in, the house makes the rules. Fraternities are notorious for setting the tone of how members act, who they associate with and even how they speak. The U.S. works the same way: once you’re past the border, culture and politics dictate what’s acceptable and what’s excused.Trump’s infamous Access Hollywood tape was brushed off as “locker room talk.” Anyone who’s been in a frat knows exactly what that means: a culture where misogyny isn’t condemned; it’s normalized as tradition. And Trump isn’t an outlier. Defense Secretary Pete Hegseth recently scolded generals for being “fat” in the halls of the Pentagon. He said, “It's completely unacceptable to see fat generals and admirals in the halls of the Pentagon and leading commands around the country and the world.”What starts as frat culture seeps into governance. The overlap isn’t just metaphorical anymore. It’s literal.The Second Test: Will You Submit?Every frat has rules, but everyone knows they bend when it suits those in charge. National bylaws formally ban hazing and restrict alcohol, yet incidents from Penn State and Bowling Green prove those rules are selectively enforced. They are college kids, not U.S. government officials. So, no. No leeway here. The Constitution is supposed to be America’s house bylaws, but we’re watching them get bent the same way fraternity brothers bend their codes of conduct. The Fifth Amendment says: “No person shall … be deprived of life, liberty, or property, without due process of law.” No person. Not no citizen. 
Yet under Trump’s use of the Alien Enemies Act, Venezuelan migrants have been deported straight into El Salvador’s prisons — without hearings, charges or convictions. That’s not due process. That’s rule by decree.The First Amendment is also under attack. Student activists like Rümeysa Öztürk have been detained by ICE for nothing more than co-writing an article. Visas revoked, scholars deported — not because of crime, but because of speech. This amendment also doesn’t specify “citizens,” it specifies “the people.”The Fourteenth Amendment’s Equal Protection Clause is hanging by a thread. Trump’s executive order to end birthright citizenship — blocked in court but revealing in intent — openly defies the guarantee that “All persons born or naturalized in the United States… are citizens.” If the president of the United States feels free to ignore the Constitution’s plain text, what comes next?The Third Test: Will You Stay Silent? I don’t deny a country, or a club, has the right to control its borders or rules. Sovereignty matters. Rules are necessary. But we have slipped into what I call selective constitutionalism: rules enforced for some (Second Amendment) and not others (Fifth Amendment). Viktor Orbán’s Hungary is the case study: elections still happen, but rules are bent until liberal democracy is hollowed out and only the shell remains. A common rebuttal is: “Trump just says things, he doesn’t mean them.” That’s a lie we tell ourselves to stay comfortable. When I say something outrageous, it’s just words. When a president says the same thing, it becomes marching orders. Presidential words are not idle. They are policy.Even when I disagree with a law — like the Second Amendment — I still believe in respecting due process. If Biden abolished gun rights tomorrow by executive order, I would oppose it. Because the process matters. 
Without it, the rule of law is gone.I use this metaphor because it shouldn’t be foreign to any of us: if you’ve witnessed the selectivism that exists in Greek life at Duke, or seen rules bent to protect a brother, you’ve seen the same logic, when scaled up, applied to our nation's Constitution. What’s tolerable as a flawed college club becomes catastrophic when it defines a nation.When rules are bent long enough, they don’t snap. They disappear. In a frat, that leaves chaos. In a country, it leaves tyranny.Noor Nazir is a Trinity junior. Her columns typically run on alternate Tuesdays. "
parse_article_page()
parse_article_page <- function(url) { # define a function with one argument, url
  article_page <- read_html(url) # read in the page at the url
  article_page |> # start with the page
    html_elements(".article-content") |> # extract element w/ selector .article-content
    html_text2() |> # extract text from element (and clean it up)
    str_remove_all("\n") # remove all newline characters, return result
}
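A minimal sketch of how a function like this might be applied to several pages at once. It assumes a data frame with one URL per row and uses rowwise() so the function is called once per row; the two pages are small hypothetical HTML files written to a temporary folder, not real Chronicle articles.

```r
library(rvest)
library(stringr)
library(dplyr)

# Same logic as the function above
parse_article_page <- function(url) {
  read_html(url) |>
    html_elements(".article-content") |>
    html_text2() |>
    str_remove_all("\n")
}

# Hypothetical helper: write a tiny stand-in article page to a temporary file
make_page <- function(body) {
  path <- tempfile(fileext = ".html")
  writeLines(
    paste0(
      '<html><body><article class="article-content">',
      body,
      "</article></body></html>"
    ),
    path
  )
  path
}

articles <- tibble(
  title = c("Article one", "Article two"),
  url = c(
    make_page("<p>First paragraph.</p><p>Second paragraph.</p>"),
    make_page("<p>Only paragraph.</p>")
  )
)

# rowwise() makes mutate() call the function once per row (once per URL)
articles_parsed <- articles |>
  rowwise() |>
  mutate(article = parse_article_page(url)) |>
  ungroup()
```

Without rowwise(), mutate() would pass the whole url column to the function in a single call, which read_html() does not accept.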
# A tibble: 1 × 2
title article
<chr> <chr>
1 The United States is a frat. And I can’t unsee it. "Fraternities thrive on exclusivity. At Duke, that posture may be tolerable, even expected. But once you’ve seen how Greek life operates (gatekeep…
# A tibble: 3 × 2
# Rowwise:
title article
<chr> <chr>
1 The United States is a frat. And I can’t unsee it. "Fraternities thrive on exclusivity. At Duke, that posture may be tolerable, even expected. But once you’ve seen how Greek life operates (gatekeep…
2 The problem with censorship and discourse at Duke "In the wake of the tragic assassination of right-wing influencer Charlie Kirk, the Trump administration has attempted to crack down on what they …
3 The 'Duke Difference' we actually need "A week ago, hundreds of Duke students filled Page Auditorium to hear Former U.S. Secretary of Transportation Pete Buttigieg speak. He outlined a …
# A tibble: 3 × 2
title article
<chr> <chr>
1 The United States is a frat. And I can’t unsee it. "Fraternities thrive on exclusivity. At Duke, that posture may be tolerable, even expected. But once you’ve seen how Greek life operates (gatekeep…
2 The problem with censorship and discourse at Duke "In the wake of the tragic assassination of right-wing influencer Charlie Kirk, the Trump administration has attempted to crack down on what they …
3 The 'Duke Difference' we actually need "A week ago, hundreds of Duke students filled Page Auditorium to hear Former U.S. Secretary of Transportation Pete Buttigieg speak. He outlined a …
This can take a bit to run!
# A tibble: 500 × 8
title author date_time month day column url article
<chr> <chr> <dttm> <chr> <dbl> <chr> <chr> <chr>
1 The Un… Noor … 2025-10-07 10:00:00 Oct 7 Opini… http… "Frate…
2 The pr… Harri… 2025-10-07 10:00:00 Oct 7 Campu… http… "In th…
3 The 'D… Gabri… 2025-10-06 14:30:00 Oct 6 Campu… http… "A wee…
4 Death … Luke … 2025-10-06 10:00:00 Oct 6 Campu… http… "Some …
5 Hazing… Monda… 2025-10-06 04:00:00 Oct 6 Campu… http… "Edito…
6 Duke’s… Lucas… 2025-10-04 10:00:00 Oct 4 Campu… http… "Duke …
7 The wo… Leo G… 2025-10-03 10:00:00 Oct 3 Campu… http… "Recen…
8 We’ve … Kayle… 2025-10-02 14:00:00 Oct 2 Opini… http… "As a …
9 How Du… Neel … 2025-10-01 10:00:00 Oct 1 Campu… http… "Comin…
10 Why ar… Ryan … 2025-10-01 10:00:00 Oct 1 Campu… http… "The e…
# ℹ 490 more rows
Now that you have the data, how would you produce the summary table below?
Column | Avg. # words/article | # articles |
---|---|---|
Campus Voices | 942 | 382 |
Letters To The Editor | 307 | 19 |
Opinion | 1020 | 99 |
# A tibble: 500 × 9
n_words title author date_time month day column url
<dbl> <chr> <chr> <dttm> <chr> <dbl> <chr> <chr>
1 1006 The Un… Noor … 2025-10-07 10:00:00 Oct 7 Opini… http…
2 1189 The pr… Harri… 2025-10-07 10:00:00 Oct 7 Campu… http…
3 778 The 'D… Gabri… 2025-10-06 14:30:00 Oct 6 Campu… http…
4 614 Death … Luke … 2025-10-06 10:00:00 Oct 6 Campu… http…
5 534 Hazing… Monda… 2025-10-06 04:00:00 Oct 6 Campu… http…
6 839 Duke’s… Lucas… 2025-10-04 10:00:00 Oct 4 Campu… http…
7 1212 The wo… Leo G… 2025-10-03 10:00:00 Oct 3 Campu… http…
8 1245 We’ve … Kayle… 2025-10-02 14:00:00 Oct 2 Opini… http…
9 937 How Du… Neel … 2025-10-01 10:00:00 Oct 1 Campu… http…
10 1041 Why ar… Ryan … 2025-10-01 10:00:00 Oct 1 Campu… http…
# ℹ 490 more rows
# ℹ 1 more variable: article <chr>
# A tibble: 500 × 9
# Groups: column [3]
title author date_time month day column url article
<chr> <chr> <dttm> <chr> <dbl> <chr> <chr> <chr>
1 The Un… Noor … 2025-10-07 10:00:00 Oct 7 Opini… http… "Frate…
2 The pr… Harri… 2025-10-07 10:00:00 Oct 7 Campu… http… "In th…
3 The 'D… Gabri… 2025-10-06 14:30:00 Oct 6 Campu… http… "A wee…
4 Death … Luke … 2025-10-06 10:00:00 Oct 6 Campu… http… "Some …
5 Hazing… Monda… 2025-10-06 04:00:00 Oct 6 Campu… http… "Edito…
6 Duke’s… Lucas… 2025-10-04 10:00:00 Oct 4 Campu… http… "Duke …
7 The wo… Leo G… 2025-10-03 10:00:00 Oct 3 Campu… http… "Recen…
8 We’ve … Kayle… 2025-10-02 14:00:00 Oct 2 Opini… http… "As a …
9 How Du… Neel … 2025-10-01 10:00:00 Oct 1 Campu… http… "Comin…
10 Why ar… Ryan … 2025-10-01 10:00:00 Oct 1 Campu… http… "The e…
# ℹ 490 more rows
# ℹ 1 more variable: n_words <dbl>
with the kable() function from the knitr package:
chronicle_article |>
  mutate(n_words = str_count(article, " ") + 1) |>
  group_by(column) |>
  summarize(
    avg_n_words = mean(n_words),
    n_articles = n()
  ) |>
  kable(
    col.names = c("Column", "Avg. # words/article", "# articles")
  )
Column | Avg. # words/article | # articles |
---|---|---|
Campus Voices | 941.8848 | 382 |
Letters To The Editor | 306.6316 | 19 |
Opinion | 1020.2121 | 99 |
chronicle_article |>
  mutate(n_words = str_count(article, " ") + 1) |>
  group_by(column) |>
  summarize(
    avg_n_words = mean(n_words),
    n_articles = n()
  ) |>
  kable(
    col.names = c("Column", "Avg. # words/article", "# articles"),
    digits = 0
  )
Column | Avg. # words/article | # articles |
---|---|---|
Campus Voices | 942 | 382 |
Letters To The Editor | 307 | 19 |
Opinion | 1020 | 99 |
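The word counts above come from str_count(article, " ") + 1, which counts spaces and adds one. The sketch below checks that rule on a few toy strings (not taken from the scraped data) and compares it with one possible alternative, counting runs of non-whitespace characters. It suggests the counts are best read as approximations.

```r
library(stringr)

# Toy strings, not taken from the scraped data
texts <- c(
  "Fraternities thrive on exclusivity.", # 4 words, 3 spaces
  "one  two",                            # 2 words, but 2 spaces
  "end of one.Start of next"             # a removed newline fuses two words
)

# The rule used in the slides: number of spaces plus one
n_words_spaces <- str_count(texts, " ") + 1

# An alternative: count runs of non-whitespace characters
n_words_runs <- str_count(texts, "\\S+")
```

Here n_words_spaces is 4, 3, 5 and n_words_runs is 4, 2, 5: the double space is overcounted by the first rule, and both rules miss one word where removing a newline fused the end of one paragraph to the start of the next.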