Assessment Brief
School of Psychology and Computer Science
UCLan Coursework Assessment Brief
XXXXXXXXXX
Module Title: Programming
Module Code: CO1401
Level 4
Word Filter
This assessment is worth 100% of the overall module mark
THE BRIEF/INSTRUCTIONS
This assignment was inspired by a BBC report about an “overzealous profanity checker” on Virgin Media. Your assignment is to produce a C++ program that reads a list of banned words from a file, then reads a second text file and filters it using those banned words.
The original BBC article can be found here: http://www.bbc.co.uk/news/technology XXXXXXXXXX
This is an individual project and no group work is permitted.
You will be assessed on your implementation of the solution which must be produced using C++.
Do not diverge from the assignment specification. If you do not conform to the assignment specification then you will lose marks.
You may conduct your own research into topics that have not been explicitly taught within the module (this will be required for higher marks). However, you should include a justification for their use, alongside links to any sources, in your program comments. Failure to do so may result in a plagiarism investigation. Only use concepts that you are confident you understand.
Learning Outcomes
This assessment has been designed to assess the following learning outcomes:
· Apply the principles of programming.
· Design an appropriate solution for a given problem.
· Implement a readable and maintainable software solution of their own design.
· Evaluate the quality of their developed software.
Marking Scheme
Use the marking criteria provided to guide your design for your solution. Please follow this carefully.
Marking bands are indicative and can be overridden at the marker’s discretion with justification.
Pass Criteria:
· 10 marks: The files ‘banned.txt’ and ‘text1.txt’ are successfully read into the program.
· 20 marks: Perform a comparison between the words from ‘banned.txt’ and the words from ‘text1.txt’.
Hint: it will be easiest to read the banned words from the file into an array.
You can then read the words from ‘text1.txt’ and compare them with the words in the array.
· 10 marks: Display how many times each banned word has been found in ‘text1.txt’ on the screen.
i.e.
‘dog’ found 0 times
‘cat’ found 3 times
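The pass criteria above could be approached along the following lines. This is only a minimal sketch, not a model answer: the helper names readWords, countBannedWords and formatCount are hypothetical, and std::vector is used in place of the array suggested in the hint. It also assumes the text contains only lower-case words separated by whitespace.

```cpp
#include <cstddef>
#include <istream>
#include <string>
#include <vector>

// Hypothetical helper: reads whitespace-separated words from any input
// stream. A std::ifstream opened on a file can be passed in directly.
std::vector<std::string> readWords(std::istream& input)
{
    std::vector<std::string> words;
    std::string word;
    while (input >> word)
    {
        words.push_back(word);
    }
    return words;
}

// Hypothetical helper: counts[i] is the number of words in 'text' that are
// exactly equal to banned[i] (whole-word, case-sensitive comparison).
std::vector<int> countBannedWords(const std::vector<std::string>& banned,
                                  const std::vector<std::string>& text)
{
    std::vector<int> counts(banned.size(), 0);
    for (const std::string& word : text)
    {
        for (std::size_t i = 0; i < banned.size(); ++i)
        {
            if (word == banned[i])
            {
                ++counts[i];
            }
        }
    }
    return counts;
}

// Hypothetical helper: builds one line of the on-screen report.
std::string formatCount(const std::string& word, int count)
{
    return "'" + word + "' found " + std::to_string(count) + " times";
}
```

Because readWords takes a std::istream reference, the same function can serve both ‘banned.txt’ and ‘text1.txt’ once each file has been opened and checked.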
3rd Criteria:
· Up to 10 marks will be given for following good programming practices:
· Your code must be properly indented and laid out so that it is readable.
· Brackets must line up (and should normally be on a line of their own).
· Indentation must be consistent.
· Appropriate use of white space should be made.
· Over-long lines of code or comments should be split up.
· You should have no ‘magic numbers’ but instead make proper use of constants.
· Variable names should be meaningful and not excessively long.
· Your code should be commented appropriately.
2:2 Criteria:
· 3 marks: Filter the text from ‘text1.txt’ by comparing every word with the list of banned words.
If you find any word from the banned list then you must replace it with *** (3 asterisks).
· 2 marks: Write the filtered text to an output file: ‘text1Filtered.txt’.
· 5 marks: Use functions to separate your code into sensible, reusable parts.
Comment these functions appropriately.
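As an illustration of the 2:2 criteria, the filtering step might be split into small functions as sketched below. The function names and the MASK constant are assumptions made for this example, and the sketch still assumes lower-case text without punctuation. Note that it rebuilds each line with single spaces between words, so runs of spaces in the input are collapsed.

```cpp
#include <algorithm>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Replacement text for a banned word (a named constant, not a magic value).
const std::string MASK = "***";

// Returns MASK if 'word' exactly matches a banned word, otherwise the word.
std::string filterWord(const std::string& word,
                       const std::vector<std::string>& banned)
{
    const bool isBanned =
        std::find(banned.begin(), banned.end(), word) != banned.end();
    return isBanned ? MASK : word;
}

// Filters every whitespace-separated word on one line of text.
std::string filterLine(const std::string& line,
                       const std::vector<std::string>& banned)
{
    std::istringstream words(line);
    std::string word;
    std::string result;
    while (words >> word)
    {
        if (!result.empty())
        {
            result += ' ';
        }
        result += filterWord(word, banned);
    }
    return result;
}

// Copies 'input' to 'output' line by line, filtering as it goes.
// Passing a std::ifstream and a std::ofstream reads and writes files.
void filterStream(std::istream& input, std::ostream& output,
                  const std::vector<std::string>& banned)
{
    std::string line;
    while (std::getline(input, line))
    {
        output << filterLine(line, banned) << '\n';
    }
}
```

Keeping the per-word decision in its own function means later criteria can change how a word is matched or masked without touching the file handling.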
2:1 Criteria:
· 5 marks: Read in ‘text2.txt’, ‘text3.txt’ and ‘text4.txt’.
You should extend your text filtration to these files - this will be more difficult as they contain punctuation and upper case characters.
Write your filtered text to separate output files so ‘text1.txt’ is written to ‘text1Filtered.txt’, ‘text2.txt’ is written to ‘text2Filtered.txt’ etc.
· 3 marks: Filter all instances of the banned words, including instances where the banned word occurs inside another word, e.g. one of the banned words is ‘cat’ so ‘catalogue’ is banned because ‘cat’ occurs inside the word ‘catalogue’. For this level it is acceptable to replace the whole word (i.e. catalogue) with *** (3 asterisks).
· 2 marks: Update your filtration function so it is able to handle both uppercase and
lowercase letters. i.e. ‘Cat’ is banned as well as ‘cat’.
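One possible way to handle punctuation, upper-case letters and banned words inside longer words is sketched below. It treats each unbroken run of letters as a word, lower-cases a copy for comparison, and replaces the whole run with *** if any banned word occurs inside it. The function names are hypothetical, and the sketch assumes the banned list is already in lower case. A word such as "don't" is treated as two runs, "don" and "t".

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Returns a lower-case copy of 'text'.
std::string toLower(std::string text)
{
    for (char& c : text)
    {
        c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    }
    return text;
}

// True if any banned word occurs anywhere inside 'word', ignoring case.
// Assumes every entry in 'banned' is already lower case.
bool containsBanned(const std::string& word,
                    const std::vector<std::string>& banned)
{
    const std::string lowered = toLower(word);
    for (const std::string& bannedWord : banned)
    {
        if (!bannedWord.empty() &&
            lowered.find(bannedWord) != std::string::npos)
        {
            return true;
        }
    }
    return false;
}

// Replaces each run of letters that contains a banned word with "***".
// Punctuation, digits and spacing are copied through unchanged.
std::string filterLineIgnoringCase(const std::string& line,
                                   const std::vector<std::string>& banned)
{
    const std::string MASK = "***";
    std::string result;
    std::size_t i = 0;
    while (i < line.size())
    {
        if (std::isalpha(static_cast<unsigned char>(line[i])))
        {
            std::size_t end = i;
            while (end < line.size() &&
                   std::isalpha(static_cast<unsigned char>(line[end])))
            {
                ++end;
            }
            const std::string word = line.substr(i, end - i);
            result += containsBanned(word, banned) ? MASK : word;
            i = end;
        }
        else
        {
            result += line[i];
            ++i;
        }
    }
    return result;
}
```

Comparing against a lower-cased copy, rather than changing the text itself, leaves the capitalisation of unbanned words intact in the output file.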
1st Criteria:
· 5 marks: Replace all occurrences of banned words with the correct number of asterisks,
e.g. "cat" becomes "***", whilst "classification" becomes "classifi***ion".
· 3 marks: Display the 10 most frequent words from each file, and for all files combined on the screen.
· 2 marks: Sort the top 10 words lists into alphabetical order.
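The top 10 word lists could be produced roughly as follows. This is a sketch under stated assumptions: the names WordCount, TOP_WORD_LIMIT and topWords are invented for the example, the words are assumed to be lower-cased and stripped of punctuation beforehand, and ties in frequency are broken alphabetically (the brief does not say how ties should be handled). It uses std::map and std::sort, which may go beyond what has been taught.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// A word paired with the number of times it was seen.
using WordCount = std::pair<std::string, int>;

// Number of entries in a "top words" list.
const std::size_t TOP_WORD_LIMIT = 10;

// Orders by frequency (highest first), then alphabetically on ties.
bool moreFrequent(const WordCount& left, const WordCount& right)
{
    if (left.second != right.second)
    {
        return left.second > right.second;
    }
    return left.first < right.first;
}

// Orders alphabetically by word.
bool alphabetical(const WordCount& left, const WordCount& right)
{
    return left.first < right.first;
}

// Returns the 'limit' most frequent words, sorted alphabetically.
std::vector<WordCount> topWords(const std::vector<std::string>& words,
                                std::size_t limit)
{
    std::map<std::string, int> frequency;
    for (const std::string& word : words)
    {
        ++frequency[word];
    }

    std::vector<WordCount> ranked(frequency.begin(), frequency.end());
    std::sort(ranked.begin(), ranked.end(), moreFrequent);
    if (ranked.size() > limit)
    {
        ranked.erase(ranked.begin() + static_cast<std::ptrdiff_t>(limit),
                     ranked.end());
    }
    std::sort(ranked.begin(), ranked.end(), alphabetical);
    return ranked;
}
```

Calling topWords(words, TOP_WORD_LIMIT) on the words of one file gives that file's list. Appending the words of every file to one vector first gives the combined list.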
High 1st Criteria:
· 5 marks: Instead of replacing the whole of each banned word with asterisks, you should replace only the middle character with an asterisk.
The end characters are left unchanged, e.g. "cat" becomes "c*t", whilst "classification" becomes "classific*tion".
· 5 marks: Provide a comprehensive statistical analysis of each text file:
· Frequency of each word.
· Analysis of the word length (frequency of each word of a particular length and mean (average) length of the words).
· Frequency count of each letter and the number of times each banned word was found, both as a whole and as a sub-string (within another word).
· 5 marks: Sort all words from each text file into a single file called ‘sorted.txt’
· 5 marks: Add any extra features you wish, e.g. use of classes and/or vectors instead of arrays.
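The two masking styles described in the 1st and High 1st criteria can be compared side by side in a short sketch. The names maskAll and maskMiddle are hypothetical. Both functions are case-sensitive, so they assume the word and the banned word have already been converted to the same case. For a banned word with an even number of letters there is no single middle character; this sketch arbitrarily picks the character just right of centre, which is an assumption rather than something the brief specifies.

```cpp
#include <cstddef>
#include <string>

// Replaces every occurrence of 'banned' inside 'word' with the same
// number of asterisks, e.g. "classification" -> "classifi***ion".
std::string maskAll(std::string word, const std::string& banned)
{
    if (banned.empty())
    {
        return word;
    }
    std::size_t pos = word.find(banned);
    while (pos != std::string::npos)
    {
        word.replace(pos, banned.size(), std::string(banned.size(), '*'));
        pos = word.find(banned, pos + banned.size());
    }
    return word;
}

// Replaces only the middle character of every occurrence of 'banned'
// inside 'word', e.g. "classification" -> "classific*tion".
std::string maskMiddle(std::string word, const std::string& banned)
{
    if (banned.empty())
    {
        return word;
    }
    const std::size_t middleOffset = banned.size() / 2;
    std::size_t pos = word.find(banned);
    while (pos != std::string::npos)
    {
        word[pos + middleOffset] = '*';
        pos = word.find(banned, pos + banned.size());
    }
    return word;
}
```

Both functions search onwards from the end of each match, so a word containing the banned word more than once has every occurrence masked.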
PREPARATION FOR THE ASSESSMENT
Before attempting this assessment, it is highly recommended that you revisit the “Four L's”:
· Lectures – This includes the slides, notes and recording.
· Lecture notes – Any notes you took during the lectures.
· Lab worksheets – Read over all lab worksheets.
· Lab projects – Ensure all projects have at least stage one implemented.
Combined, these provide all the information you need to complete this assessment successfully. All resources are available on the CO1401 Blackboard area under Module Materials.
RELEASE DATES AND HAND IN DEADLINE
Assessment Release date: 22/02/2021