Data to information
Mar 1, 2024
Plain text
Sound is a vibration. Sound travels as a mechanical wave through a medium, and in space, there is no medium. So when my shuttle malfunctioned and the airlocks didn't keep the air in, I heard nothing. After the first whoosh of the air being sucked away, there was lightning, but no thunder. Eyes bulging in panic, but no screams. Quiet and peaceful, right? Such a relief to never again hear my crewmate Jesse natter about his girl back on Earth and that all-expenses-paid vacation-for-two she won last time he was on leave. I swore, if I ever had to see a photo of him in a skimpy bathing suit again, giving the camera a cheesy thumbs-up from a lounge chair on one of those white sandy beaches, I'd kiss a monkey. Metaphorically, of course.
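Plain text carries no explicit structure, so sentence and word boundaries have to be inferred before the passage can be treated as data. The following is a minimal sketch of that step, written in Python with only the standard library (the deck itself names R packages, so this is a stand-in, not the course's method). The function name `text_to_rows` and both regular expressions are assumptions made for illustration; real text usually needs a more careful tokenizer.

```python
import re

# Opening two sentences of the plain-text sample.
TEXT_SAMPLE = (
    "Sound is a vibration. Sound travels as a mechanical wave "
    "through a medium, and in space, there is no medium."
)


def text_to_rows(text):
    """Split text into sentences, then tokens, returning one dict per token."""
    # Assumption: a sentence ends at '.', '!' or '?' followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    rows = []
    for sent_id, sentence in enumerate(sentences, start=1):
        # Assumption: a token is a run of word characters (allowing internal
        # hyphens or apostrophes) or a single punctuation mark.
        tokens = re.findall(r"\w+(?:[-']\w+)*|[^\w\s]", sentence)
        for word_id, token in enumerate(tokens, start=1):
            rows.append({"sent_id": sent_id, "word_id": word_id, "token": token})
    return rows


text_rows = text_to_rows(TEXT_SAMPLE)
for row in text_rows:
    print(row)
```

Each row pairs a token with a sentence id and a within-sentence word id, which is the same information the XML version of this passage records explicitly as attributes.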
XML
<document>
  <sent id="1">
    <word id="1" modality="written">Sound</word>
    <word id="2" modality="written">is</word>
    <word id="3" modality="written">a</word>
    <word id="4" modality="written">vibration</word>
    <word id="5" modality="written">.</word>
  </sent>
  <sent id="2">
    <word id="1" modality="written">Sound</word>
    <word id="2" modality="written">travels</word>
    <word id="3" modality="written">as</word>
    <word id="4" modality="written">a</word>
    <word id="5" modality="written">mechanical</word>
    <word id="6" modality="written">wave</word>
    <word id="7" modality="written">through</word>
    <word id="8" modality="written">a</word>
    <word id="9" modality="written">medium</word>
    <word id="10" modality="written">,</word>
    <word id="11" modality="written">and</word>
    <word id="12" modality="written">in</word>
    <word id="13" modality="written">space</word>
    <word id="14" modality="written">,</word>
    <word id="15" modality="written">there</word>
    <word id="16" modality="written">is</word>
    <word id="17" modality="written">no</word>
    <word id="18" modality="written">medium</word>
    <word id="19" modality="written">.</word>
  </sent>
</document>
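Because the XML nests words inside sentences, turning it into a table means walking the tree and copying each sentence id down onto its words. A minimal sketch of that flattening, assuming Python's standard `xml.etree.ElementTree` module rather than the R packages the deck names; the function name `xml_to_rows` and the output column names are hypothetical.

```python
import xml.etree.ElementTree as ET

# First sentence of the XML sample, kept short for illustration.
XML_SAMPLE = """
<document>
  <sent id="1">
    <word id="1" modality="written">Sound</word>
    <word id="2" modality="written">is</word>
    <word id="3" modality="written">a</word>
    <word id="4" modality="written">vibration</word>
    <word id="5" modality="written">.</word>
  </sent>
</document>
"""


def xml_to_rows(xml_text):
    """Return one dict per <word> element, carrying its sentence id along."""
    root = ET.fromstring(xml_text.strip())
    rows = []
    for sent in root.iter("sent"):
        for word in sent.iter("word"):
            rows.append({
                "sent_id": int(sent.get("id")),   # attribute of the parent <sent>
                "word_id": int(word.get("id")),   # attribute of the <word> itself
                "modality": word.get("modality"),
                "token": word.text,               # text content of the element
            })
    return rows


xml_rows = xml_to_rows(XML_SAMPLE)
for row in xml_rows:
    print(row)
```

The result has one observation per row and one variable per column, so it could be written out as a delimited file without further reshaping.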
JSON
{
  "document": {
    "sent": [
      {
        "id": "1",
        "word": [
          {
            "id": "1",
            "modality": "written",
            "word": "Sound"
          },
          {
            "id": "2",
            "modality": "written",
            "word": "is"
          },
          {
            "id": "3",
            "modality": "written",
            "word": "a"
          },
          {
            "id": "4",
            "modality": "written",
            "word": "vibration"
          },
          {
            "id": "5",
            "modality": "written",
            "word": "."
          }
        ]
      },
      {
        "id": "2",
        "word": [
          {
            "id": "1",
            "modality": "written",
            "word": "Sound"
          },
          {
            "id": "2",
            "modality": "written",
            "word": "travels"
          },
          {
            "id": "3",
            "modality": "written",
            "word": "as"
          }
        ]
      }
    ]
  }
}
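The JSON version stores the same hierarchy as nested objects and arrays, so flattening it follows the same pattern: loop over sentences, then over each sentence's words. A minimal sketch, assuming Python's standard `json` module as a stand-in for the R tooling the deck names; the function name `json_to_rows` and the output column names are hypothetical, and the sample is shortened to two words per sentence.

```python
import json

# Shortened copy of the JSON sample: two sentences, two words each.
JSON_SAMPLE = """
{
  "document": {
    "sent": [
      {
        "id": "1",
        "word": [
          {"id": "1", "modality": "written", "word": "Sound"},
          {"id": "2", "modality": "written", "word": "is"}
        ]
      },
      {
        "id": "2",
        "word": [
          {"id": "1", "modality": "written", "word": "Sound"},
          {"id": "2", "modality": "written", "word": "travels"}
        ]
      }
    ]
  }
}
"""


def json_to_rows(json_text):
    """Return one dict per word object, carrying its sentence id along."""
    data = json.loads(json_text)
    rows = []
    for sent in data["document"]["sent"]:
        for word in sent["word"]:
            rows.append({
                "sent_id": int(sent["id"]),   # ids are stored as strings in the sample
                "word_id": int(word["id"]),
                "modality": word["modality"],
                "token": word["word"],
            })
    return rows


json_rows = json_to_rows(JSON_SAMPLE)
for row in json_rows:
    print(row)
```

Note that `json.loads` rejects input with unbalanced brackets, so a parse error at this step is often the first sign that a hand-edited file is malformed.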
Physical Semantic
fs: File system operations
readr: Read plain text, csv, tsv, and other delimited files
stringr: String manipulation, find/replace (regular expressions)
tidyr: Tidy data, reshape data (more transformational)
Characteristics
Already tidy!
Characteristics
Requires tidying!
Approaches
Curate | Quantitative Text Analysis | Wake Forest University