
Database Development 
Read the following articles (articles are provided as attachments):
· Dual Assessment of Data Quality in Customer Databases, Journal of Data and Information Quality (JDIQ), Volume 1 Issue 3, December 2009, Adir Even, G. Shankaranarayanan.
· Process-centered review of object oriented software development methodologies, ACM Computing Surveys (CSUR), Volume 40 Issue 1, February 2008, Raman Ramsin, and Richard F. Paige.
Write a two to three (3) page paper in which you:
1. Recommend at least three (3) specific tasks that could be performed to improve the quality of datasets, using the Software Development Life Cycle (SDLC) methodology. Include a thorough description of each activity for each of the seven (7) phases listed below:
· Planning (Project Initiation)
· Requirements (Analysis)
· Design and Prototyping
· Software Development (Implementation)
· Testing and Integration
· Deployment
· Operations and Maintenance
2. Recommend the actions that should be performed in order to optimize record selections and to improve database performance from a quantitative data quality assessment (this is a two part question).
3. Suggest three (3) maintenance plans and three (3) activities that could be performed in order to improve data quality.
4. From the software development methodologies described in the article titled, “Process-centered Review of Object Oriented Software Development Methodologies,” complete the following.
a. Evaluate which method would be efficient for planning proactive concurrency control methods and lock granularities. Assess how your selected method can be used to minimize the database security risks that may occur within a multiuser environment.
b. Analyze how the method can be used to plan out the system effectively and ensure that the number of transactions does not produce record-level locking while the database is in operation.
Your assignment must follow these formatting requirements:
· Be typed, double-spaced, using Times New Roman font (size 12), with one-inch margins on all sides; citations and references must follow APA format.
· Include a cover page containing the title and the date. The cover page and the reference page are not included in the required page length.

Dual Assessment of Data Quality
in Customer Databases
ADIR EVEN
Ben-Gurion University of the Negev
and
G. SHANKARANARAYANAN
Babson College
Quantitative assessment of data quality is critical for identifying the presence of data defects and
the extent of the damage due to these defects. Quantitative assessment can help define realis-
tic quality improvement targets, track progress, evaluate the impacts of different solutions, and
prioritize improvement efforts accordingly. This study describes a methodology for quantitatively
assessing both impartial and contextual data quality in large datasets. Impartial assessment mea-
sures the extent to which a dataset is defective, independent of the context in which that dataset
is used. Contextual assessment, as defined in this study, measures the extent to which the pres-
ence of defects reduces a dataset’s utility, the benefits gained by using that dataset in a specific
context. The dual assessment methodology is demonstrated in the context of Customer Relation-
ship Management (CRM), using large data samples from real-world datasets. The results from
comparing the two assessments offer important insights for directing quality maintenance efforts
and prioritizing quality improvement solutions for this dataset. The study describes the steps and
the computation involved in the dual-assessment methodology and discusses the implications for
applying the methodology in other business contexts and data environments.
Categories and Subject Descriptors: E.m [Data]: Miscellaneous
General Terms: Economics, Management, Measurement
Additional Key Words and Phrases: Data quality, databases, total data quality management,
information value, customer relationship management, CRM
ACM Reference Format:
Even, A. and Shankaranarayanan, G. 2009. Dual assessment of data quality in customer
databases. ACM J. Data Inform. Quality 1, 3, Article 15 (December 2009), 29 pages.
ACM Journal of Data and Information Quality, Vol. 1, No. 3, Article 15, Pub. date: December 2009.
1. INTRODUCTION
High-quality data makes organizational data resources more usable and, con-
sequently, increases the business benefits gained from using them. It con-
tributes to efficient and effective business operations, improved decision mak-
ing, and increased trust in information systems [DeLone and McLean 1992;
Redman 1996]. Advances in information systems and technology permit orga-
nizations to collect large amounts of data and to build and manage complex
data resources. Organizations gain competitive advantage by using these re-
sources to enhance business processes, develop analytics, and acquire business
intelligence [Davenport 2006]. The size and complexity make data resources
vulnerable to data defects that reduce their data quality. Detecting defects and
improving quality is expensive, and when the targeted quality level is high, the
costs often negate the benefits. Given the economic trade-offs in achieving and
sustaining high data quality, this study suggests a novel economic perspective
for data quality management. The methodology for dual assessment of qual-
ity in datasets described here accounts for the presence of data defects in that
dataset, assuming that costs for improving quality increase with the number
of defects. It also accounts for the impact of defects on benefits gained from
using that dataset.
Quantitative assessment of quality is critical in large data environments, as
it can help set up realistic quality improvement targets, track progress, assess
impacts of different solutions, and prioritize improvement efforts accordingly.
Data quality is typically assessed along multiple quality dimensions (e.g.,
accuracy, completeness, and currency), each reflecting a different type of
quality defect [Wang and Strong 1996]. Literature has described several methods
for assessing data quality and the resulting quality measurements often ad-
here to a scale between 0 (poor) and 1 (perfect) [Wang et al. 1995; Redman
1996; Pipino et al. 2002]. Some methods, referred to by Ballou and Pazer [2003]
as structure-based or structural, are driven by physical characteristics of the
as structure-based or structural, are driven by physical characteristics of the
data (e.g., item counts, time tags, or defect rates). Such methods are impar-
tial as they assume an objective quality standard and disregard the context in
which the data is used. We interpret these measurement methods as reflecting
the presence of quality defects (e.g., missing values, invalid data items, and
incorrect calculations). The extent of the presence of quality defects in a dataset,
the impartial quality, is typically measured as the ratio of the number of
nondefective records and the total number of records. For example, in the sam-
ple dataset shown in Table I, let us assume that no contact information is avail-
able for customer A. Only 1 out of 4 records in this dataset has missing values;
hence, an impartial measurement of its completeness would be (4−1)/4 = 0.75.
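The impartial ratio can be computed mechanically. A minimal Python sketch follows; the sample records are invented stand-ins for Table I (only customer A's missing contact comes from the text, all other field values are illustrative assumptions):

```python
# Impartial completeness: fraction of records with no missing values,
# measured independently of how the data will be used.
records = [
    {"customer": "A", "contact": None,       "children": 1, "income": 95000},
    {"customer": "B", "contact": "555-0101", "children": 4, "income": 40000},
    {"customer": "C", "contact": "555-0102", "children": 0, "income": 120000},
    {"customer": "D", "contact": "555-0103", "children": 3, "income": 38000},
]

def impartial_completeness(rows):
    # A record is non-defective if none of its fields is missing.
    non_defective = sum(1 for r in rows if all(v is not None for v in r.values()))
    return non_defective / len(rows)

print(impartial_completeness(records))  # (4 - 1) / 4 = 0.75
```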
Other measurement methods, referred to as content-based [Ballou and
Pazer 2003], derive the measurement from data content. Such measurements
Pazer 2003], derive the measurement from data content. Such measurements
typically reflect the impact of quality defects within a specific usage context
and are also called contextual assessments [Pipino et al. 2002]. Data-quality
literature has stressed the importance of contextual assessments as the im-
pact of defects can vary depending on the context [Jarke et al. 2002; Fisher
et al. 2003]. However, literature does not minimize the importance of impartial
Table I. Sample Dataset
assessments. In certain cases, the same dimension can be measured both
impartially and contextually, depending on the purpose [Pipino et al. 2002].
Given the example in Table I, let us first consider a usage context that exam-
ines the promotion of educational loans for dependent children. In this context,
the records that matter the most are the ones corresponding to customers B
and D: families with many children and relatively low income. These records
and D: families with many children and relatively low income. These records
have no missing values and hence, for this context, the dataset may be consid-
ered complete (i.e., a completeness score of 1). For another usage context that
promotes luxury vacation packages, the records that matter the most are those
corresponding to customers with relatively higher income, A and C. Since 1
out of these 2 records is defective (record A is missing contact), the complete-
ness of this dataset for this usage context is only 0.5.
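Both contextual scores follow from restricting the same completeness ratio to the records relevant to each use. A sketch under assumed sample data (incomes and contacts are illustrative; the text fixes only which customers are higher- or lower-income and that A's contact is missing):

```python
# Contextual completeness: the same ratio as the impartial measure, but
# computed only over the records that matter in a given usage context.
records = [
    {"customer": "A", "contact": None,       "children": 1, "income": 95000},
    {"customer": "B", "contact": "555-0101", "children": 4, "income": 40000},
    {"customer": "C", "contact": "555-0102", "children": 0, "income": 120000},
    {"customer": "D", "contact": "555-0103", "children": 3, "income": 38000},
]

def contextual_completeness(rows, relevant):
    subset = [r for r in rows if relevant(r)]
    ok = sum(1 for r in subset if all(v is not None for v in r.values()))
    return ok / len(subset)

# Educational-loan promotion: lower-income families (B and D), both complete.
print(contextual_completeness(records, lambda r: r["income"] < 60000))   # 1.0
# Luxury-vacation promotion: higher-income customers (A and C), A defective.
print(contextual_completeness(records, lambda r: r["income"] >= 60000))  # 0.5
```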
In this study we describe a methodology for the dual assessment of quality;
dual, as it assesses quality both impartially and contextually and draws con-
clusions and insights from comparing the two assessments. Our objective is
to show that the dual perspective can enhance quality assessments and help
direct and prioritize quality improvement efforts. This is particularly true in
large and complex data environments in which such efforts are associated with
significant cost-benefit trade-offs. From an economic viewpoint, we suggest
that impartial assessments can be linked to costs. The higher the number of
defects in a dataset, the more is the effort and time needed to fix it and the
higher the cost for improving the quality of this dataset. On the other hand,
depending on the context of use, improving quality differentially affects the
usability of the dataset. Hence, we suggest that contextual assessment can be
associated with the benefits gained by improving data quality. To underscore
this differentiation, in our example (Table I), the impartial assessment indi-
cates that 25% of the dataset is defective. Correcting each defect would cost
the same, regardless of the context of use. However, the benefits gained by
correcting these defects may vary, depending on the context of use.
Answered Same Day Dec 05, 2021

Solution

Ritu answered on Dec 07 2021
145 Votes
Database Development
Recommend at least three (3) specific tasks that could be performed to improve the quality of datasets, using the Software Development Life Cycle (SDLC) methodology. Include a thorough description of each activity for each phase
SDLC is short for the Software Development Life Cycle, the methodology used to create or modify software systems and programs. Each phase addresses a distinct set of activities; the cycle commonly runs through planning, requirements analysis, design and prototyping, implementation, testing and integration, deployment, and operations and maintenance.
There are multiple ways to improve the quality of a dataset using the SDLC method. Some of them are:
· Eliminate duplicate records
· Validate inputs
· Sweep out false null values
· Ensure that data values fall within a given domain
· Resolve data conflicts
· Build confidence in proper classification and data usage
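The first item, duplicate elimination, can be sketched as a key-based de-duplication pass. The key fields `name` and `email` below are illustrative assumptions, not prescribed by the source:

```python
# Remove duplicate records, keeping the first occurrence of each key;
# which fields identify a duplicate is a design decision per dataset.
def deduplicate(rows, key_fields=("name", "email")):
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row[f] for f in key_fields)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique

rows = [
    {"name": "Ann", "email": "ann@example.com"},
    {"name": "Ann", "email": "ann@example.com"},  # exact duplicate
    {"name": "Bob", "email": "bob@example.com"},
]
print(len(deduplicate(rows)))  # 2
```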
Sweep of false null values: A null represents an unknown value, which is different from zero or an empty string, and one null is not equal to another. Null normally indicates data that is unknown or not yet available; for example, a customer's middle name may be unknown at the time the customer places an order. When a null value participates in a logical comparison, the comparison can return a third result, UNKNOWN, instead of true or false. It is therefore recommended to minimize the use of nulls in query predicates and data modifications (SQL Server, 2000).
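The three-valued behavior can be seen directly in SQLite, which follows the same SQL rule: a comparison with NULL evaluates to UNKNOWN, so the row satisfies neither the equality nor the inequality predicate:

```python
import sqlite3

# Demonstrate SQL three-valued logic: Ann's NULL middle name is returned
# by neither "= 'Q'" nor "<> 'Q'"; only IS NULL matches it.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE customer (name TEXT, middle_name TEXT)")
con.execute("INSERT INTO customer VALUES ('Ann', NULL), ('Bob', 'Q')")

eq = con.execute("SELECT name FROM customer WHERE middle_name = 'Q'").fetchall()
neq = con.execute("SELECT name FROM customer WHERE middle_name <> 'Q'").fetchall()
isnull = con.execute("SELECT name FROM customer WHERE middle_name IS NULL").fetchall()

print(eq)      # [('Bob',)]
print(neq)     # []  -- the comparison with NULL is UNKNOWN, not TRUE
print(isnull)  # [('Ann',)]
```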
Ensuring that data values are within the given domain: As laid out in the specification, make sure every field contains the required information and that each value falls within its permitted range or set of allowed values.
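A simple way to enforce this is a table of allowed value sets checked per row; the field names and domains below are illustrative assumptions:

```python
# Declare the allowed domain for each constrained field, then flag any
# row whose value falls outside its domain.
DOMAINS = {
    "state": {"MA", "NY", "CA"},
    "status": {"active", "inactive"},
}

def domain_violations(row):
    # Return the names of fields whose values are outside their domain.
    return [f for f, allowed in DOMAINS.items()
            if f in row and row[f] not in allowed]

print(domain_violations({"state": "MA", "status": "active"}))  # []
print(domain_violations({"state": "ZZ", "status": "active"}))  # ['state']
```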
Resolving conflicts in data: The amount of data available to organizations has grown rapidly in recent years, and the same fact is often recorded differently across sources. Conflict resolution identifies these disagreements and reconciles them into a single consistent value.
Recommend the actions that should be performed in order to optimize record selections as well as to improve database performance from a quantitative data quality assessment
The following should be executed to improve database selection and optimize database performance from the quantitative data quality assessment:
Deadlock handling: When a deadlock is detected, one of the transactions involved is chosen as the victim and aborted; any new lock requests from it are canceled, all changes made by that transaction are rolled back, and all locks...
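Rollback of a victim transaction is one strategy; deadlock can also be prevented up front by acquiring locks in a single global order, so no cycle of waiting transactions can form. A thread-level sketch of that idea in Python (an analogy, not a mechanism from the articles):

```python
import threading

# Two "transactions" touch the same pair of records in opposite order.
# Sorting the locks into one global order before acquiring them means
# every thread locks the lower-id lock first, so no wait cycle can form.
lock_a, lock_b = threading.Lock(), threading.Lock()
done = []

def transfer(first, second):
    lo, hi = sorted((first, second), key=id)  # agree on a global order
    with lo, hi:
        done.append(True)  # critical section: update both records

t1 = threading.Thread(target=transfer, args=(lock_a, lock_b))
t2 = threading.Thread(target=transfer, args=(lock_b, lock_a))
t1.start(); t2.start()
t1.join(); t2.join()
print(len(done))  # 2: both transactions completed without deadlock
```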