Home > Uncategorized > Text Mining Question Bank

Text Mining Question Bank

1. Natural Language Processing

  1. Give 5 examples for Holonyms, Hyponyms, Hypernyms, Metonyms, Meronyms, Homonyms, Synonyms, Polysems.
  2. Draw the Venn diagram of Spellings-Meanings-Pronunciations.
  3. Why are Context Free Grammars Context free ?
  4. What is the difference between RTN and ATN ?
  5. Give examples of Prepositional Phrases.
  6. Compare CFG and ATN.
  7. Give 5 examples for Anaphora, Cataphora, Endophora, Exophora.
  8. Give 5 examples of NP ellipsis, VP ellipsis.
  9. Write a CFG, ATN for the following:
    1. “Tech Companies queue up for Open Source Professionals”.
    2. I love my language.
    3. Patriotism is not about watching cricket matches together.
    4. AMD’s microcode is more richer than Intel.
    5. Ron Weasley should marry Hermoine Granger.
    6. Krishna is a metonym for uncertainty.
    7. PMPO is 8 times that of RMS power measured for a 1KHz signal with an amplitude of 1V.
  10. What are the Named Entities in
    1. “Open Source helps Life Spring Hospitals” ?
    2. I want to work for Burning Glass Technologies Inc.
    3. The university life at SRM is very informal.
    4. AMD Phenom 5500 Black Edition can be unleashed to 4 cores.
    5. Hail Hitler!
    6. Anushka is taller than Surya.
  11. Do NP chunking on
    1. Tips and Tools for measuring the world and beating the odds
    2. The crazy frog is an awesome song
    3. Time flies like arrow.
    4. Thevaram was written by Appar.
    5. Text mining is awfully interesting.
    6. I need to get placed is a good company.
  12. Write a Regular Expression for replacing the beginning and end of all the lines in a text file with the strings “” and “” respectively.
  13. Write a regular expression for capturing Indian mobile numbers, land line numbers and Indian pin codes with maximum possible inherent validation.
  14. Write a regular expression for capturing the vehicle numbers, PAN numbers, Passport numbers in a new paper article.
  15. Identify rules to capturing dates and discriminating the job dates, education dates and date of birth.
  16. Give examples for Noun stemming in English & {Tamil or Telugu or Hindi} languages.  Transliterate the Indian language.
  17. Give examples for Verb stemming in English & {Tamil or Telugu or Hindi} languages.  Transliterate the Indian language.
  18. How does a spell checker work ?
  19. Take some arbitrary texts and summarize them in to a line or two.  Justify the reason for the choice of words and sentences in your summary.
  20. Show some examples for word-by-word, sentence-by-sentence, context-by-context machine translation.

2. Information Extraction & Statistical NLP

  1. If Prob(A) is 0.4 and Prob(B) is 0.6, what is Prob(A,B), Prob(A|B), Prob(A u B), Prob(A – B), Prob(A n B) ?  If some data is missing, assume a reasonable value for it.
  2. Let A be a random variable with instances a1, a2, a3, a4, a5.  If P(a1) = 1.8e-4, P(a2) = 5.2e-8, P(a3) = 0.042, P(a4) = 0.00052, P(a5)=0.2, compute Sigma P(A), PI P(A) without mathematical underflow.
  3. Give real life examples for 1st order markov processes.
  4. Give real life examples of Expectation-Maximization.
  5. If p[[0.1 0.3 0.2 0.4],[0.3 0.4 0.2 0.1],[0.3 0.3 0.1 0.3], [0.2 0.4 0.1 0.3]] is the state transition probability of any 4 states {A,B,C,D} in a HMM, calculate P(A->B->C->D).
  6. Based on (5), check whether the probability of state sequence is commutative (ex: P(A->B->C) = P(C->B->A) ?)
  7. If the observation probability is [[.2 .4 .1 .3], [.6 .1 .0 .3], [.0 .0 .0 1.0], [.1 .1 .1 .7], [.4 .4 .1 .1]] for observations {i, j, k, l, m} in states as per(5). Compute the P(O={k,l}).
  8. Annotate the items in (9) of Section 1 and build the state transition, observation, initial probability matrices.
  9. Show that usage of forward probabilities reduce the time-complexity of evaluation problem.
  10. Show that usage of forward-backward probabilities reduce the time-complexity of decoding problem.

    Powered by ScribeFire.

  1. No comments yet.
  1. No trackbacks yet.
You must be logged in to post a comment.