Objective: Implement a keyword search interface that enables: (A) web search via a web search API; (B) local search on a given dataset.

(A) Web search (40 points): The search interface should look and function like a mainstream web search engine. Search results should contain title and snippet information. Each title should be clickable and link to its corresponding page. You may use the Google, Bing, or Yahoo web search API, and any programming language you prefer. The yahooAPI.pdf (possibly outdated) is provided as a reference only; you do not need it to complete this assignment. Your implementation should be hosted as a web service. The following link has information on setting up your TxState Linux account URL: https://cs.txstate.edu/resources/labs/accounts/linux/ (e.g., you could create a simple “hello world” HTML file named index.html, put it in the public_html directory under your home directory, then visit http://cs.txstate.edu/~NetID). Feel free to use a different URL.

(B) Local search (60 points): You may use the same search interface as in Part (A), with buttons that let the user switch between web search and local search. You may also use a separate interface. Ideally, the search interface should be hosted as a web service that allows public access. Note that your university Linux account will not work for this purpose because you don’t have installation permission on university servers. You’ll need to set up your own server (e.g., using Apache) and URL. You’re responsible for reading online tutorials and working on it independently. If you fail to host the web service, you’ll need to schedule an in-person demo with me for grading, and your implementation will be subject to a minor deduction (only a few points, as this is not one of the learning objectives for this course; nonetheless, it’s an integral part of a real-life search engine prototype and a useful skill for CS students and IT practitioners).

For this implementation, you may use Lucene or another open source platform of your choice (such as Solr or Elasticsearch). The provided dataset, lyrics.csv, contains 50 years of pop music lyrics (modified from a source file in https://github.com/walkerkq/musiclyrics). Indexing additional datasets such as Wikipedia or Amazon reviews is up to your own interest (no bonus or credit). In the lyrics.csv dataset, each song is considered a document containing rank, title, year, artist and lyrics information.

Your search interface should allow the user to enter keywords as queries (similar to Google). For each query, a list of search results should be clearly displayed. Each search result corresponds to a document (song). For each search result, show the title, rank, year and artist information (but not the actual lyrics) of the corresponding document, as well as a dynamically generated snippet. The title should be clickable; upon clicking, the entire document (including the actual lyrics) should be displayed, either on another page or in a pop-up window.

Major web search engines such as Google all display snippets for search results. A snippet is a short summary of a document. It can be generated dynamically in a KWIC (keyword in context) style. Brief explanations of snippets can be found at https://nlp.stanford.edu/IR-book/html/htmledition/results-snippets-1.html (or in the textbook Introduction to Information Retrieval, Section 8.7 “Results snippets”, page 157). While you may design your own simple algorithm, it’s a much better idea to generate snippets using the tools or APIs already provided by Lucene or Elasticsearch. Search “Lucene highlighter” for more information and start from there.
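To make the KWIC idea concrete, here is a minimal hand-rolled sketch in Python. It is not the Lucene highlighter or any Elasticsearch API, only an illustration of "keyword in context" under simple assumptions: the query is split on whitespace, terms match whole words case-insensitively, and each fragment is cut at word boundaries. The names kwic_snippet, window and max_fragments are hypothetical and do not come from the assignment.

```python
import html
import re


def kwic_snippet(text, query, window=40, max_fragments=2):
    """Build a keyword-in-context snippet (illustrative sketch only).

    Each fragment keeps roughly `window` characters on either side of a
    matched query term, and matched terms are wrapped in <b>...</b>.
    """
    terms = [re.escape(t) for t in query.split()]
    if not terms:
        return html.escape(text[: 2 * window])
    # Whole-word, case-insensitive match on any query term.
    pattern = re.compile(r"\b(?:" + "|".join(terms) + r")\b", re.IGNORECASE)

    def highlight(fragment):
        # Escape the plain text and wrap every match in <b> tags.
        out, pos = [], 0
        for m in pattern.finditer(fragment):
            out.append(html.escape(fragment[pos:m.start()]))
            out.append("<b>" + html.escape(m.group(0)) + "</b>")
            pos = m.end()
        out.append(html.escape(fragment[pos:]))
        return "".join(out)

    fragments, last_end = [], 0
    for m in pattern.finditer(text):
        if m.start() < last_end:
            continue  # this match already sits inside the previous fragment
        start = max(0, m.start() - window)
        end = min(len(text), m.end() + window)
        # Widen the cut to whole words so no word is sliced in half.
        start = text.rfind(" ", 0, start) + 1
        next_space = text.find(" ", end)
        end = len(text) if next_space == -1 else next_space
        prefix = "..." if start > 0 else ""
        suffix = "..." if end < len(text) else ""
        fragments.append(prefix + highlight(text[start:end]) + suffix)
        last_end = end
        if len(fragments) == max_fragments:
            break

    if not fragments:
        # No query term occurs in the text: fall back to its opening words.
        return html.escape(text[: 2 * window])
    return " ".join(fragments)
```

For example, calling kwic_snippet(lyrics, "love", window=10) on a song's lyrics returns a short HTML string in which each occurrence of "love" inside the selected fragments is bolded. A library highlighter would additionally reuse the index's analyzer (stemming, stop words) and rank fragments by relevance, which this sketch does not attempt.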
Submission: Prepare a short report in txt format that includes the following information:
• URL of the web service. Make sure the URL works until the grade is released. We assume the service works in a straightforward way; otherwise, please include short instructions on how to use it. If you use two separate URLs for Part A and Part B, provide both of them.
• A short free-form summary describing your implementation, observations and comments.
Submit this short report as a txt file to Tracs. Please also submit a separate zip file to Tracs that includes your main source code for verification purposes.
Answered Same Day May 07, 2021

Solution

David answered on May 09 2021