Calculate Document Similarity
Requirements:
1. Create a scenario describing when and why you might want to determine whether comments are positive or negative (or male/female, pass/fail, or any other “binary” categorization). Explain how the results could be used.
2. You must read the data in from a file.
3. You are required to implement both of the following:
o A vectorization method or tool. The provided examples use sklearn's CountVectorizer, but you can use any vectorization tool or Jaccard distance.
o NLTK grammar-based feature extraction: context-free grammar, syntactic parsers, extracting key phrases, or extracting entities (you are only required to use one of these).
4. Create a dictionary of sample words that you will use to search and categorize your data.
5. Display each output result with a summary.
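
As a rough illustration of requirements 2, 4, and 5 together with the Jaccard option from requirement 3, the sketch below uses only the Python standard library. Everything named in it is an assumption made for the example: the word lists in SENTIMENT_WORDS, the one-comment-per-line file layout, and the helper names tokenize, jaccard_similarity, categorize, read_comments, and summarize. It does not cover the NLTK grammar-based requirement, and a real submission would likely need a larger dictionary and a real data file.

```python
import re
import tempfile
from pathlib import Path

# Hypothetical keyword dictionary (requirement 4); the words are illustrative only.
SENTIMENT_WORDS = {
    "positive": {"good", "great", "love", "excellent", "helpful"},
    "negative": {"bad", "poor", "hate", "terrible", "broken"},
}


def tokenize(text):
    """Lowercase the text and return its set of word tokens."""
    return set(re.findall(r"[a-z']+", text.lower()))


def jaccard_similarity(a, b):
    """Return |A & B| / |A | B| for two token sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def categorize(tokens, lexicon=SENTIMENT_WORDS):
    """Label tokens by whichever word list they overlap more; ties are 'undecided'."""
    pos = len(tokens & lexicon["positive"])
    neg = len(tokens & lexicon["negative"])
    if pos == neg:
        return "undecided"
    return "positive" if pos > neg else "negative"


def read_comments(path):
    """Read one comment per non-empty line from a UTF-8 text file (assumed layout)."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    return [line.strip() for line in lines if line.strip()]


def summarize(comments):
    """Count how many comments fall into each category."""
    labels = [categorize(tokenize(c)) for c in comments]
    return {name: labels.count(name) for name in ("positive", "negative", "undecided")}


if __name__ == "__main__":
    # Write a tiny made-up sample file so the sketch runs end to end.
    with tempfile.TemporaryDirectory() as tmp:
        sample = Path(tmp) / "comments.txt"
        sample.write_text(
            "I love this, great service\nTerrible, the app is broken\nIt arrived on Tuesday\n",
            encoding="utf-8",
        )
        comments = read_comments(sample)

    # Per-comment output, then pairwise similarity, then an overall summary.
    for comment in comments:
        print(f"{categorize(tokenize(comment)):>10}: {comment}")
    for i in range(len(comments)):
        for j in range(i + 1, len(comments)):
            score = jaccard_similarity(tokenize(comments[i]), tokenize(comments[j]))
            print(f"similarity(comment {i + 1}, comment {j + 1}) = {score:.2f}")
    print("summary:", summarize(comments))
```

Jaccard distance is one minus the similarity computed here, so either form can be reported. Comparing token sets ignores word frequency; a count-based vectorizer such as the one in the provided examples would keep it.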