Question

Part B: Readability

This program will read in the contents of a text file containing a normal text document and reorganize its contents by separately storing each sentence of the text.

For our purposes, the end of a sentence is marked by any word that ends with one of the three characters . ? !
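The end-of-sentence rule above can be sketched as a small helper. This is only one possible reading of the rule; the class name `SentenceEnd` and the method name `endsSentence` are hypothetical and are not part of the assignment.

```java
// Hypothetical helper (names are illustrative, not from the assignment).
class SentenceEnd {
    // Returns true when the word ends with one of the three characters . ? !
    static boolean endsSentence(String word) {
        if (word.isEmpty()) {
            return false; // an empty token cannot end a sentence
        }
        char last = word.charAt(word.length() - 1);
        return last == '.' || last == '?' || last == '!';
    }

    public static void main(String[] args) {
        System.out.println(endsSentence("cryptography!")); // true
        System.out.println(endsSentence("information,"));  // false
    }
}
```

Taken literally, the rule also treats an abbreviation such as "etc." as the end of a sentence, while a word such as "(i.e.," does not end one because its last character is a comma.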

Read in the lines of text, process them into words, and store them into an array (or, optionally, an ArrayList) of objects of your Sentence class. Each Sentence object should start out empty, and have words added to it as they are received from the file. Your Sentence class must contain at least the following:

  • text (String): the text of the sentence;
  • wordCount (int): the number of words in the sentence (only count words, where a word contains one or more letters); and
  • add(String word): add the given word to the sentence.
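A minimal sketch of a Sentence class with the three required members might look like the following. The getters and the private helper `containsLetter` are assumptions added for illustration, as is the choice to join words with a single space.

```java
// Sketch of the required Sentence class; getters and containsLetter are assumed extras.
class Sentence {
    private String text = "";  // the text of the sentence, starts out empty
    private int wordCount = 0; // number of words that contain at least one letter

    // Adds the given word to the sentence, separated from earlier words by one space.
    public void add(String word) {
        if (!text.isEmpty()) {
            text += " ";
        }
        text += word;
        if (containsLetter(word)) {
            wordCount++; // tokens such as "-" or "666" are stored but not counted
        }
    }

    // True when the word has one or more letters.
    private static boolean containsLetter(String word) {
        for (int i = 0; i < word.length(); i++) {
            if (Character.isLetter(word.charAt(i))) {
                return true;
            }
        }
        return false;
    }

    public String getText() {
        return text;
    }

    public int getWordCount() {
        return wordCount;
    }
}
```

For example, adding the tokens "Codes", "-", "666", and "work." in that order would leave the text as "Codes - 666 work." with a word count of 2.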

If you are storing the sentences in an array, you can assume that the text file contains at most 1000 sentences.
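The reading step with a fixed-size array could be sketched as below. The names `SentenceReaderSketch`, `MAX_SENTENCES`, and `readSentences` are hypothetical, and the nested `Sentence` is a compact stand-in for your own class. Storing trailing words that never reach a terminator as a final sentence is an assumption; the assignment does not say how to handle that case.

```java
import java.util.Scanner;

// Sketch only: class, constant, and method names are illustrative assumptions.
class SentenceReaderSketch {
    static final int MAX_SENTENCES = 1000; // limit stated in the assignment

    // Compact stand-in for the assignment's Sentence class.
    static class Sentence {
        String text = "";
        int wordCount = 0;

        void add(String word) {
            text = text.isEmpty() ? word : text + " " + word;
            if (word.chars().anyMatch(Character::isLetter)) {
                wordCount++;
            }
        }
    }

    // Reads whitespace-separated words and fills the array; returns how many sentences were stored.
    static int readSentences(Scanner in, Sentence[] sentences) {
        int count = 0;
        Sentence current = null;
        while (in.hasNext() && count < sentences.length) {
            String word = in.next();
            if (current == null) {
                current = new Sentence(); // each sentence starts out empty
            }
            current.add(word);
            char last = word.charAt(word.length() - 1);
            if (last == '.' || last == '?' || last == '!') {
                sentences[count++] = current;
                current = null;
            }
        }
        // Assumption: leftover words without a terminator become a final sentence.
        if (current != null && count < sentences.length) {
            sentences[count++] = current;
        }
        return count;
    }

    public static void main(String[] args) {
        // A string stands in for the file here; a real program might use new Scanner(new File(name)).
        Scanner in = new Scanner("Hi there! How are you? Fine.");
        Sentence[] sentences = new Sentence[MAX_SENTENCES];
        int count = readSentences(in, sentences);
        for (int i = 0; i < count; i++) {
            System.out.println((i + 1) + ": " + sentences[i].text);
        }
    }
}
```

Keeping a separate count alongside the array is what lets the later steps find the last five sentences without scanning for empty slots.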

Once you have read in the contents of the file, process it in the following way:

  1. Print the first five sentences in the file. Number the sentences according to their sequence in the document (the first sentence is number 1).
  2. Print the last five sentences in the file. Number the sentences according to their sequence in the document.
  3. Print summary statistics over the entire document, including the number of letters (counting only letters, not digits, spaces, or punctuation), words, and sentences, and the Automated Readability Index (ARI) of the text. The ARI is calculated as follows: ARI = 4.71 * (letters / words) + 0.5 * (words / sentences) - 21.43, and provides an estimate of the readability of the text according to its grade level. Round it to one decimal place.
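The three processing steps can be sketched as follows, assuming the commonly cited ARI formula, 4.71 * (letters / words) + 0.5 * (words / sentences) - 21.43, with the letter count used as the character count. The class `ReadabilityStats` and its method names are hypothetical, and plain strings stand in for Sentence objects to keep the example short.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch only: names are illustrative; the ARI constants are the commonly cited ones (an assumption here).
class ReadabilityStats {
    // Counts letters only, skipping digits, spaces, and punctuation.
    static int countLetters(String text) {
        int letters = 0;
        for (int i = 0; i < text.length(); i++) {
            if (Character.isLetter(text.charAt(i))) {
                letters++;
            }
        }
        return letters;
    }

    // Computes the ARI and rounds it to one decimal place.
    static double ari(int letters, int words, int sentences) {
        double raw = 4.71 * letters / words + 0.5 * words / sentences - 21.43;
        return Math.round(raw * 10.0) / 10.0;
    }

    // Returns sentences from index 'from' (inclusive) to 'to' (exclusive), numbered from 1 by document position.
    static List<String> numbered(List<String> sentences, int from, int to) {
        List<String> out = new ArrayList<>();
        int start = Math.max(0, from);
        int end = Math.min(to, sentences.size());
        for (int i = start; i < end; i++) {
            out.add((i + 1) + ": " + sentences.get(i));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> sentences = Arrays.asList("One.", "Two.", "Three.", "Four.", "Five.", "Six.", "Seven.");
        int n = sentences.size();
        numbered(sentences, 0, 5).forEach(System.out::println);     // first five
        numbered(sentences, n - 5, n).forEach(System.out::println); // last five
        System.out.println(ari(100, 20, 2)); // 7.1
    }
}
```

Clamping the range in `numbered` means a document with fewer than five sentences prints whatever it has instead of failing, and the first and last groups may then overlap.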

For example, given the following text file:

Text that follows is based on the Wikipedia page on cryptography! Cryptography is the practice and study of hiding information. In modern times, cryptography is considered to be a branch of both mathematics and computer science, and is affiliated closely with information theory, computer security, and engineering. Cryptography is used in applications present in technologically advanced societies; examples include the security of ATM cards, computer passwords, and electronic commerce, which all depend on cryptography. Until modern times, cryptography referred almost exclusively to encryption, the process of converting ordinary information (plaintext) into unintelligible gibberish (i.e., ciphertext). Decryption is the reverse, moving from unintelligible ciphertext to plaintext. A cipher (or cypher) is a pair of algorithms which perform this encryption and the reversing decryption. The detailed operation of a cipher is controlled both by the algorithm and, in each instance, by a key. This is a secret parameter (ideally, known only to the communicants) for a specific message exchange context. Keys are important, as ciphers without variable keys are trivially breakable and therefore less than useful for most purposes. Historically, ciphers were often used directly for encryption or decryption, without additional procedures such as authentication or integrity checks. In colloquial use, the term "code" is often used to mean any method of encryption or concealment of meaning. However, in cryptography, code has a more specific meaning; it means the replacement of a unit of plaintext (i.e., a meaningful word or phrase) with a code word (for example, apple pie replaces attack at dawn). Codes are no longer used in serious cryptography - except incidentally for such things as unit designations (e.g., 'Bronco Flight' or Operation Overlord) - since properly chosen ciphers are both more practical and more secure than even the best codes, and better adapted to computers as well. 
Some use the terms cryptography and cryptology interchangeably in English, while others use cryptography to refer specifically to the use and practice of cryptographic techniques, and cryptology to refer to the combined study of cryptography and cryptanalysis. The Ancient Greek scytale (rhymes with Italy), probably much like this modern reconstruction, may have been one of the earliest devices used to implement a cipher. Before the modern era, cryptography was concerned solely with message confidentiality (i.e., encryption) - conversion of messages from a comprehensible form into an incomprehensible one, and back again at the other end, rendering it unreadable by interceptors or eavesdroppers without secret knowledge (namely, the key needed for decryption of that message). In recent decades, the field has expanded beyond confidentiality concerns to include techniques for message integrity checking, sender/receiver identity authentication, digital signatures, interactive proofs, and secure computation, amongst others. The earliest forms of secret writing required little more than local pen and paper analogs, as most people could not read. More literacy, or opponent literacy, required actual cryptography. The main classical cipher types are transposition ciphers, which rearrange the order of letters in a message (e.g., 'help me' becomes 'ehpl em' in a trivially simple rearrangement scheme), and substitution ciphers, which systematically replace letters or groups of letters with other letters or groups of letters (e.g., 'fly at once' becomes 'gmz bu podf' by replacing each letter with the one following it in the alphabet). Simple versions of either offered little confidentiality from enterprising opponents, and still don't. An early substitution cipher was the Caesar cipher, in which each letter in the plaintext was replaced by a letter
some fixed number of positions further down the alphabet. It was named after Julius Caesar who is reported to have used it, with a shift of 3, to communicate with his generals during his military campaigns, just like EXCESS-3 code in boolean algebra. Encryption attempts to ensure secrecy in communications, such as those of spies, military leaders, and diplomats, but it has also had religious applications. For instance, early Christians used cryptography to obfuscate some aspects of their religious writings to avoid the near certain persecution they would have faced had they been less cautious; famously, 666 or in some early manuscripts, 616, the Number of the Beast from the Christian New Testament Book of Revelation, is sometimes thought to be a ciphertext referring to the Roman Emperor Nero, one of whose policies was persecution of Christians. There is record of several, even earlier, Hebrew ciphers as well. Cryptography is recommended in the Kama Sutra as a way for lovers to communicate without inconvenient discovery. Steganography (i.e., hiding even the existence of a message so as to keep it confidential) was also first developed in ancient times. An early example, from Herodotus, concealed a message - a tattoo on a slave's shaved head - under the regrown hair. More modern examples of steganography include the use of invisible ink, microdots, and digital watermarks to conceal information. Ciphertexts produced by classical ciphers (and some modern ones) always reveal statistical information about the plaintext, which can often be used to break them. After the discovery of frequency analysis (perhaps by the Arab polymath al-Kindi) about the 9th century, nearly all such ciphers became more or less readily breakable by an informed attacker. Such classical ciphers still enjoy popularity today, though mostly as puzzles (see cryptogram). 
Essentially all ciphers remained vulnerable to cryptanalysis using this technique until the invention of the polyalphabetic cipher, most clearly by Leon Battista Alberti around the year 1467 (though there is some indication of earlier Arab knowledge of them). Alberti's innovation was to use different ciphers (i.e., substitution alphabets) for various parts of a message (perhaps for each successive plaintext letter in the limit). He also invented what was probably the first automatic cipher device, a wheel which implemented a partial realization of his invention. In the polyalphabetic Vigenère cipher, encryption uses a key word, which controls letter substitution depending on which letter of the key word is used. In the mid 1800s Babbage showed that polyalphabetic ciphers of this type remained partially vulnerable to frequency analysis techniques. The Enigma machine, used in several variants by the German military between the late 1920s and the end of World War II, implemented a complex electro-mechanical polyalphabetic cipher to protect sensitive communications. Breaking the Enigma cipher at the Biuro Szyfrów, and the subsequent large-scale decryption of Enigma traffic at Bletchley Park, was an important factor contributing to the Allied victory in WWII. Although frequency analysis is a powerful and general technique, encryption was still often effective in practice; many a would-be cryptanalyst was unaware of the technique. Breaking a message without frequency analysis essentially required knowledge of the cipher used, thus encouraging espionage, bribery, burglary, defection, etc. to discover it. It was finally explicitly recognized in the 19th century that secrecy of a cipher's algorithm is not a sensible or practical safeguard; in fact, it was further realized any adequate cryptographic scheme (including ciphers) should remain secure even if the adversary fully understands the cipher algorithm itself. Secrecy of the key should