
Project Requirements

Summary

For the project, you will read in a file and use dictionaries to count how many times letters and words appear. You will then use these dictionaries to generate a CSV file with one row per letter, containing the following information (in this order):

  • Letter: A lowercase letter.
  • Count: The total number of times that letter (in any case) appears in the file.
  • Lowercase Count: The total number of times that lowercase letter appears in the file.
  • Uppercase Count: The total number of times that uppercase letter appears in the file.
  • Starts With Count: The number of unique words that start with that letter.
  • Contains Count: The number of unique words that contain that letter (anywhere in the word).
See Project Submission for submission instructions.
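
As a rough illustration of the first three count columns, here is a minimal sketch in Python. The language, the function name `count_letters`, and the choice to store each letter's counts as a three-item list are all assumptions made for this example; they are not part of the requirements.

```python
import string

def count_letters(text):
    """Return {letter: [total, lowercase, uppercase]} for the letters a-z.

    Hypothetical helper for illustration only.
    """
    # Start every letter at zero so letters that never appear still get a row.
    counts = {letter: [0, 0, 0] for letter in string.ascii_lowercase}
    for ch in text:
        if ch not in string.ascii_letters:
            continue  # skip spaces, digits, and punctuation
        entry = counts[ch.lower()]
        entry[0] += 1          # total count, in any case
        if ch.islower():
            entry[1] += 1      # lowercase count
        else:
            entry[2] += 1      # uppercase count
    return counts

counts = count_letters("AWAY TO THE RIVER\nAway")
print(counts["a"])  # [4, 1, 3]
```

Pre-filling the dictionary with all 26 letters is one way to make sure letters such as "q" and "z" still appear in the output with counts of 0.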

Hints

I suggest you work ITERATIVELY. First, get just the letter and count to output to the CSV file. Then, modify your code so you can also track the lowercase versus uppercase count. Add only 1 or 2 columns at a time.
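
A first iteration might look something like the sketch below, which outputs only the letter and its total count. It assumes Python and its standard `csv` module; the function name `write_letter_counts` is hypothetical, and the example writes to an in-memory buffer rather than a real file so that it can be run on its own.

```python
import csv
import io
import string

def write_letter_counts(text, out):
    """Write one 'letter,count' row per letter a-z to the file-like object out."""
    totals = {letter: 0 for letter in string.ascii_lowercase}
    for ch in text.lower():
        if ch in totals:
            totals[ch] += 1
    writer = csv.writer(out, lineterminator="\n")
    for letter in string.ascii_lowercase:
        writer.writerow([letter, totals[letter]])

# Try it on one line of text, using an in-memory buffer in place of a file.
buffer = io.StringIO()
write_letter_counts("Away to the river", buffer)
print(buffer.getvalue().splitlines()[0])  # a,2
```

Once this much works, the remaining columns can be added to each row one or two at a time.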

I also suggest you take a look at the code examples in the Lectures section.

Examples

Here is a simple text file:

AWAY TO THE RIVER

Away to the river, away to the wood,
While the grasses are green and the berries are good!
Where the locusts are scraping their fiddles and bows,
And the bees keep a-coming wherever one goes.

Oh, it's off to the river and off to the hills,
To the land of the bloodroot and wild daffodils,
With a buttercup blossom to color my chin,
And a basket of burs to put sandberries in.

Excerpt from "The Peter Patter Book of Nursery Rhymes" by Leroy F. Jackson, Rand McNally & Company, 1918.

And the resulting CSV file:

a,29,24,5,5,16
b,11,10,1,10,11
c,10,9,1,3,10
d,17,17,0,1,10
e,49,46,3,1,24
f,12,11,1,3,6
g,6,6,0,4,6
h,21,20,1,1,10
i,17,16,1,2,15
j,1,0,1,1,1
k,4,4,0,1,4
l,14,13,1,3,12
m,7,6,1,2,7
n,19,17,2,1,14
o,38,36,2,4,20
p,8,6,2,3,8
q,0,0,0,0,0
r,33,29,4,3,22
s,23,23,0,2,18
t,33,29,4,3,13
u,6,6,0,0,5
v,4,3,1,0,2
w,10,6,4,6,8
x,1,1,0,0,1
y,10,9,1,0,8
z,0,0,0,0,0

Let's make sure the output makes sense. There are 5 uppercase "A"s in the text, including the two in the title. The unique words that start with "a" are:

'away', 'are', 'and', 'acoming', 'a'

Hence, the number of UNIQUE words that start with "a" is 5. There are no "z" letters in the text, so all of those counts should be 0.
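
The same kind of check can be sketched in code. The example below assumes that words are separated by whitespace, lowercased, and stripped of any non-letter characters; this is inferred from "a-coming" appearing as 'acoming' in the list above, not stated as a rule. The helper name `unique_words` is hypothetical.

```python
def unique_words(text):
    """Return the set of lowercase words in text, keeping only the letters a-z."""
    words = set()
    for token in text.split():
        word = "".join(ch for ch in token.lower() if "a" <= ch <= "z")
        if word:  # ignore tokens with no letters at all, such as "&"
            words.add(word)
    return words

# Two lines taken from the example text.
sample = ("Away to the river, away to the wood,\n"
          "And the bees keep a-coming wherever one goes.")
words = unique_words(sample)
starts_with_a = sorted(w for w in words if w.startswith("a"))
print(starts_with_a)  # ['acoming', 'and', 'away']
```

Using a set means each word is counted once no matter how often it appears, which is what the Starts With Count and Contains Count columns need.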

Project Files

  • Text file to use as input. (1k, v. 3, Aug 3, 2012, 5:37 PM, Sophie Engle)
  • Resulting CSV file. (1k, v. 3, Aug 3, 2012, 5:37 PM, Sophie Engle)