This week, you will begin the planning phase and conduct online research to investigate the...

Question

This week, you will begin the planning phase and conduct online research to investigate the requirements for setting up a data science technology stack and create a PowerPoint presentation that highlights your findings.
Specifically, make sure the following areas are discussed in your presentation:
· Available data warehousing and storage technologies
· Tools used for Extraction, Transformation, and Loading (ETL)
· Technologies that support Business Intelligence
· Visualization tools
· Machine Learning and Analytics Implementation Frameworks
· Deployment Stack (architectures)
· Common data science technology use case scenarios
Discuss the implications of a poorly designed and managed data science technology stack on data science democratization initiatives and planning.
Length: 12-15 PowerPoint slides with notes XXXXXXXXXXwords per slide), not including title and references slides
References: Include a minimum of 5 scholarly references (be sure that at least two of the five are peer-reviewed research studies involving data science technology stack planning and data democratization from the school library to support your ideas).
NB:
…”two of the five are peer-reviewed research studies involving data science technology stack planning and data democratization from the school library….”
I have downloaded 2 peer-reviewed research articles from the school library. They are the attached PDF files (school_library_post_1.pdf and school_library_post_2.pdf). Please use them for citation in place of the 2 required peer-reviewed research articles.

International Journal of Sport Communication, 2019, 12, 313–335
https://doi.org/10.1123/ijsc XXXXXXXXXX
© 2019 Human Kinetics, Inc. SCHOLARLY COMMENTARY
The Blockchain Phenomenon:
Conceptualizing Decentralized Networks
and the Value Proposition to the Sport
Industry
Michael L. Naraine
Deakin University, Australia
The sport industry has experienced significant technological change in its
environment with the recent rise of Bitcoin and its underlying foundation,
blockchain. Accordingly, the purpose of this paper is to introduce and conceptually
ground blockchain in sport and discuss the implications and value proposition
of blockchain to the sport industry. After a brief overview of blockchain and the
technology stack, the mechanism is conceptually rooted in the network paradigm,
a framework already known to the academic sport community. This treatment
argues that the decentralized, closed, and dense mesh network produced by
blockchain technology is beneficial to the sport industry. Notably, the article
identifies blockchain’s capacity to facilitate new sources of revenue and improve
data management and suggests that sport management and communication
consider the value of blockchain and the technology stack as the digital footprint
in the industry intensifies and becomes increasingly complex.
Keywords: Bitcoin, cryptocurrency, network theory, technology
On December 17, 2017, the price of Bitcoin (BTC) reached an all-time high,
with 1 unit valued at roughly $19,783 U.S. (Morris, XXXXXXXXXX The digital medium of
exchange (also known as a cryptocurrency) had been originally conceived by a
person or group working under the pseudonym Satoshi Nakamoto in 2008 and
experienced a high degree of volatility since its inception (Maurer, Nelms, &
Swartz, XXXXXXXXXX In fact, during its infancy, large amounts of BTC were exchanged
for “low-value” goods and services including pizzas and music albums and even
facilitated the online transactions of illegal drugs, weapons, and pornographic
material, much to the dismay of authorities in several jurisdictions. Despite these
seedy beginnings, BTC matured and has experienced an increase in its value,
drawing notice of larger businesses who began accepting the currency as a form of
payment, including casinos, e-commerce sites, and even NCAA Division I bowl
The author is with the Dept. of Management, Faculty of Business and Law, Deakin University,
Burwood, VIC, Australia. Address correspondence to XXXXXXXXXX.
games (Casey, XXXXXXXXXX Although the 2017 holiday shopping season might have
helped BTC reach its highest point, it is difficult to challenge its growth trajectory
with year-to-year increases upward of 2,000%.
BTC is just one of several cryptocurrencies in the marketplace, however.
Names such as Ethereum, Litecoin, Dash, and Ripple represent a small sample of
virtual currency offering a digital alternative to traditional forms of value-exchange
mediums (e.g., cash, credit). The expanse and popularity of these currencies is
partly attributable to the notion that these “moneys” are not regulated by any
government or central authority and thus cannot be manipulated by political will
(Middlebrook & Hughes, XXXXXXXXXX Nonetheless, the infatuation with the value of
cryptocurrencies overshadows key elements of the overall challenge to the
traditional value-exchange paradigm, especially the notion of blockchain and
decentralized networks. While Nakamoto’s XXXXXXXXXX advancement of BTC as a
cryptocurrency is certainly novel, the underlying support mechanism known as
blockchain is incredibly nuanced and has led to other decentralized movements
including ride-sharing, microblogging, file-storage, and crowdfunding sites, to
name a few (Swan, XXXXXXXXXX To wit, more business domains have changed their
models to incorporate blockchain (Tapscott & Tapscott, 2016, 2017), signaling
that while BTC and cryptocurrencies may be “fadlike,” there is a greater understanding
required of the underlying technology. Specifically, conceptualizing
blockchain technology and understanding its impact on the sport industry has
not yet occurred.
This omission can also be explained on two fronts. First, sport organizations
tend to maintain an inert state and often resist technological changes due to a
knowledge gap and a lack of understanding of the new technology’s impact (Slack
& Parent, XXXXXXXXXX This sentiment is exemplified by the recent surge of social media
and sport literature in the field (Filo, Lock, & Karg, 2015); although social media
has experienced significant growth over the past decade, sport management via
sport communication is only now realizing the impact of this medium on sport
stakeholders such as professional teams (e.g., Achen, Kaczorowski, Horsmann, &
Ketzler, 2018), brands (e.g., Geurin & Burch, 2017), and governing bodies
(e.g., Naraine & Parent, 2016a, 2016b). Sport organizations do make incremental,
evolutionary changes such as the virtual assistant referee in global football and
vehicle innovations in Formula One racing, but these are not radical changes to
the operational landscape that Slack and Parent have described. As such, the sport
management and communication academe is generally slow to react and understand
new, radical change innovations. Second, the lack of discussion about
blockchain in sport management and communication can be attributed to its
computational complexity. Maintaining the social media analogy, the premise
behind social networking sites like Facebook and Twitter has been simplified in
sport: Users can interact and share with other users asynchronously and synchronously
(Naraine & Karg, XXXXXXXXXX In this sense, conceptualizing social media does
not require an in-depth understanding of its algorithms and various formulae.
Conversely, blockchain has yet to be tailored for a sport management and
communication audience and, given its association with BTC (Maurer et al.,
2013), there is the potential to be overwhelmed with its mathematical association.
Although this technology has yet to be examined, its presence in the sport
industry is growing. Heitner XXXXXXXXXX reported that more sport enterprises are looking
IJSC Vol. 12, No. 3, 2019
to blockchain technology to expand their customer base globally in a safe,
protected environment, while Martínez XXXXXXXXXX pointed to highly visible professional
athletes like Stephen Curry and Jeremy Lin as investors in the blockchain
sector. In addition, associated industries such as tourism have also turned to
blockchain for growth and development (Kwok & Koh, XXXXXXXXXX With these trends,
blockchain technology is poised to become more pervasive and join social media
as a disruptive force in our industry, altering the present understanding of how
sport organizations generate revenue, store data, and generally digitize the business
management environment (Pegoraro, XXXXXXXXXX Moreover, Ratten and Ferreira (2016)
argued that maintaining the sport industry’s upward growth trajectory is predicated
on seeking out and adopting innovative approaches. Thus, while sport managers
might hesitate to embrace this new technology, understanding its disruptive,
innovative nature can help advance the industry further.
As such, the purpose of this treatment is twofold: to introduce and conceptually
ground blockchain in sport and to discuss the implications and value
proposition of blockchain to the sport industry. In order for scholars and practitioners
to assess the importance and impact of blockchain technology, it is
imperative that they develop an understanding of what blockchain entails and
how it differs from the existing, accepted paradigm (Charitou & Markides, 2003).
To help with this aim, I begin this treatment with an examination of the blockchain
system, its characteristics, and, ultimately, its relationship to cryptocurrencies like
BTC. After this initial discussion, the merits of blockchain’s decentralized network
system are revealed, juxtaposed to the extant network paradigm literature in sport,
which champions centralization. Finally, the possible applications of blockchain
in the industry are discussed, with a notable emphasis on alternative revenue-generation
schemes (e.g., sport-based tokens) and data-storage repositories.
Blockchain and Technology Stacks
Blockchain Definition
To understand how blockchain could be useful and applied in the sport industry, it
is prudent to define the concept and explain its functions. Blockchain is a
decentralized and transparent recording system; simply put, it is a set of blocks
of information strung together by various transactions over a peer-to-peer network
(Zhao, Fan, & Yan, XXXXXXXXXX This definition might still be too abstract for the sport
management and communication audience, so further unpacking is required.
The blockchain process is enacted when a transaction is requested (see
Figure 1). That transaction is generally a financial activity, as it is in the case
of BTC and other cryptocurrencies, but could consist of some other action (e.g., a
product moving through the supply chain, data being stored to a database), although
the value-exchange paradigm is a useful, straightforward example of the process.
Once a transaction is requested, a message is broadcast over a network of peers, a
network of users connected through their phones, tablets, and computers. These
peers begin to validate the transaction request based on specified criteria (i.e., a
series of algorithms), the technical, mathematical computing that is more relevant
to computer science and less relevant to this discussion. Once the request meets
those criteria and is validated, a piece of a block is created (i.e., known in BTC
terminology as a hash). This piece is combined with others to form a whole block.
A whole block is then placed alongside others to form a string of blocks known
as a ledger (Tapscott & Tapscott, XXXXXXXXXX Ledgers are transparent but unalterable,
removing the risk of manipulation by a user. Once a ledger has been created, the
entire process ceases, with the initial transaction request completed, too.
Figure 1: The blockchain process.
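The chaining idea described above, where each block is tied to the one before it so the ledger cannot be quietly altered, can be illustrated with a minimal Python sketch. This is not how Bitcoin or any production blockchain is implemented: it leaves out the peer network and the validation algorithms entirely, and the function names (`block_hash`, `append_block`, `is_valid`) and the sample transactions are hypothetical, chosen only for illustration.

```python
import hashlib
import json


def block_hash(index, transactions, previous_hash):
    """Return the SHA-256 digest of a block's contents."""
    payload = json.dumps(
        {"index": index, "transactions": transactions, "previous_hash": previous_hash},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_block(ledger, transactions):
    """Add a block that stores the hash of the block before it."""
    previous_hash = ledger[-1]["hash"] if ledger else "0" * 64
    index = len(ledger)
    ledger.append({
        "index": index,
        "transactions": transactions,
        "previous_hash": previous_hash,
        "hash": block_hash(index, transactions, previous_hash),
    })


def is_valid(ledger):
    """Recompute every hash and check each link to the previous block."""
    for i, block in enumerate(ledger):
        expected_previous = ledger[i - 1]["hash"] if i else "0" * 64
        if block["previous_hash"] != expected_previous:
            return False
        recomputed = block_hash(block["index"], block["transactions"], block["previous_hash"])
        if block["hash"] != recomputed:
            return False
    return True


# Hypothetical transactions, for illustration only.
ledger = []
append_block(ledger, [{"from": "fan", "to": "club", "amount": 5}])
append_block(ledger, [{"from": "club", "to": "vendor", "amount": 2}])
print(is_valid(ledger))  # the untouched ledger passes the check

# Tampering with an earlier transaction no longer matches its stored hash.
ledger[0]["transactions"][0]["amount"] = 500
print(is_valid(ledger))  # the altered ledger fails the check
```

Because every block records the hash of its predecessor, changing one past transaction would require recomputing every later block as well, which is the property the text describes as "transparent but unalterable".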
To a casual observer, this process may seem convoluted and unnecessary, but
its dynamics are what makes it most valuable. In the traditional (but modern)
value-exchange paradigm, one party could transmit money to another party for a product
or service. This requires a centralized actor, such as a central bank, operating as the
conduit between the two parties to ensure the validity of the transaction (i.e., that
the purchasing party is using legal means). However, modern transactions also
have multiple actors between two parties, including retail banks, credit unions,
electronic-transfer facilitators (e.g., Visa, Mastercard), and payment-infrastructure
merchants both traditional (e.g., Interac, Moneris) and emergent (e.g., PayTM,
Alipay). As trusted intermediaries, these actors help determine whether the
purchaser has the funds and ability to expend those funds for goods and services.
Despite their trustworthiness, however, there remain three important considerations
with these actors: time, cost, and security. When a buyer purchases a good or
service, that transaction has several layers to confirm the authenticity and transfer
of funds. Most purchases conducted using a credit card require several days to
settle and confirm. There is also a cost factor; as for-profit institutions, these actors
require revenue, so various fees are applied at various stages of the transaction.
Finally, modern transactions can be susceptible to what is known as the
double-spend problem (Maurer et al., XXXXXXXXXX This problem arises if a buyer attempts to
make a fraudulent digital exchange and spend the same funds twice. For instance,
while shopping online, a user decides to use a digital currency to pay for a good
from two distinctive retailers. Because the currency is digital, there is the potential
to use the same amount for both transactions, without either retailer knowing about
the fraudulent behavior. While most online retailers have systems to detect
fraudulent transactions on their backends (one of the reasons they are deemed
trustworthy) and only accept payment from electronic-transfer facilitators, it is still
plausible that a user could “hack” the system and attempt two transactions with the
same set of funds simultaneously. The need for a secure backend also speaks to the
costs associated with handling transactions in the modern environment. Thus, these
three considerations underscore the value of blockchain: its intricacies facilitate
quick, cost-effective, and safer transactions. To emphasize this point, let’s consider
BTC. With blockchain technologies, the time required to initiate and complete a
transaction is

datadisc-c5pk1hnh.docx schoollibrarypost1-tdxfidir.pdf schoollibrarypost2-q0a2ejo3.pdf

Shubham · Accepted Answer

SETTING UP A DATA SCIENCE TECHNOLOGY STACK
Available data warehousing and storage technologies 
A data warehouse is a key component of a data science technology stack: a large, centralized repository of structured, organized data that data scientists, analysts, and other stakeholders can easily access and analyze. It is specifically designed to support business intelligence, reporting, and analytics activities, storing structured, historical data from various sources and transforming it into a format that is useful for analysis and decision-making.
Amazon Redshift: a fully managed, petabyte-scale data warehouse service that offers high performance and scalability. It integrates well with other Amazon Web Services (AWS) products and offers a range of security features (Naraine, 2019).
Google BigQuery: a serverless, cloud-based data warehouse service that can process large amounts of data quickly and efficiently. It integrates well with other Google Cloud Platform services and offers a range of security features.
Snowflake: a cloud-based data warehousing platform that offers high performance and scalability, as well as a range of security features. It integrates well with other cloud-based services and supports multiple data sources.
Microsoft Azure Synapse Analytics: a cloud-based analytics service that offers both data warehousing and big data analytics capabilities. It integrates well with other Microsoft Azure services and offers a range of security features.
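To make the idea of analyzing structured, historical data in a warehouse concrete, here is a minimal sketch of a warehouse-style query. It uses Python's built-in sqlite3 module purely as a local stand-in; it is not Redshift, BigQuery, Snowflake, or Synapse, and the table names (`fact_sales`, `dim_product`) and the sample rows are assumptions made up for illustration.

```python
import sqlite3

# In-memory database used as a stand-in for a real warehouse service.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
    CREATE TABLE fact_sales (
        sale_id INTEGER PRIMARY KEY,
        product_id INTEGER,
        sale_year INTEGER,
        amount REAL
    );
    INSERT INTO dim_product VALUES (1, 'tickets'), (2, 'merchandise');
    INSERT INTO fact_sales VALUES
        (1, 1, 2022, 100.0),
        (2, 1, 2023, 150.0),
        (3, 2, 2023, 40.0),
        (4, 2, 2023, 60.0);
""")

# A typical analytical question: revenue per category per year.
rows = conn.execute("""
    SELECT p.category, s.sale_year, SUM(s.amount) AS revenue
    FROM fact_sales AS s
    JOIN dim_product AS p ON p.product_id = s.product_id
    GROUP BY p.category, s.sale_year
    ORDER BY p.category, s.sale_year
""").fetchall()

for category, year, revenue in rows:
    print(category, year, revenue)
```

The same join-and-aggregate pattern is what cloud warehouses run at much larger scale; the services listed above differ mainly in how they store the data and distribute the work.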
Available data warehousing and storage technologies (Contd..)
When setting up a data science technology stack, choosing the right storage technology is crucial to ensure that data is easily accessible, secure, and scalable. There are several storage technologies available for storing data in a data science technology stack.
Relational databases: Relational databases are the most common type of database used in data science technology stacks. They are designed to store structured data in tables, with each table having a predefined schema. Examples include MySQL, PostgreSQL, and Microsoft SQL Server.
NoSQL databases: NoSQL databases are designed to store unstructured or semi-structured data, making them ideal for storing large volumes of data that don't fit neatly into tables. Examples include MongoDB, Cassandra, and Couchbase (Hird, Kariyeva & McDermid, 2021).
Object storage: Object storage is a type of data storage that is used to store unstructured data, such as images, videos, and documents. Object storage systems use a flat address space to store data, making it easier to scale and access data quickly. Examples include Amazon S3, Microsoft Azure Blob Storage, and Google Cloud Storage.
Data lakes: Data lakes are large-scale storage repositories that allow organizations to store all types of data in their original format, without having to convert it into a structured format first. Data lakes are typically built on top of object storage systems and are used to store large volumes of data that can be used for data science and analytics.
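The difference between a predefined schema and self-describing documents can be sketched in a few lines. This is a simplified illustration only: sqlite3 stands in for a relational database, plain JSON strings stand in for a document store, and the table, field names, and records are hypothetical.

```python
import json
import sqlite3

# Relational style: every row must fit the schema declared up front.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL, city TEXT)")
conn.execute("INSERT INTO customers VALUES (1, 'Ada', 'London')")

try:
    # A row that violates the schema (missing required name) is rejected.
    conn.execute("INSERT INTO customers (id, name) VALUES (2, NULL)")
    schema_enforced = False
except sqlite3.IntegrityError:
    schema_enforced = True

# Document style: each record carries its own structure, so fields can vary.
documents = [
    json.dumps({"id": 1, "name": "Ada", "tags": ["vip"]}),
    json.dumps({"id": 2, "name": "Lin", "devices": {"phone": "android"}}),
]
parsed = [json.loads(doc) for doc in documents]
fields_per_document = [sorted(doc.keys()) for doc in parsed]

print(schema_enforced)
print(fields_per_document)
```

The trade-off shown here is the usual one: the relational table guarantees consistency at write time, while the document records accept varied shapes and leave interpretation to whoever reads them.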
Tools used for Extraction, Transformation, and Loading
Extract, Transform, Load (ETL) is a critical process in setting up a data science technology stack. It involves extracting data from various sources, transforming it into a format suitable for analysis, and loading it into a data warehouse or data lake. This process is vital to ensure that the data used for analysis is accurate, consistent, and reliable. There are several tools available for ETL, each with its own strengths and weaknesses. 
Apache Spark: Apache Spark is a popular open-source distributed computing system that is commonly used for large-scale data processing. It provides various modules for ETL, such as Spark SQL, Spark Streaming, and Spark MLlib. Spark's built-in support for distributed data processing and machine learning makes it an ideal choice for data science projects (Raschka, Patterson & Nolet, 2020).
Apache NiFi: Apache NiFi is an open-source data integration tool that is used for data routing, transformation, and system mediation. It provides a user-friendly web interface that makes it easy to create, monitor, and manage ETL pipelines. NiFi is designed to handle data in real-time, making it an excellent choice for streaming data processing.
Talend: Talend is a powerful open-source data integration tool that offers a comprehensive set of ETL features. It supports various data sources, including databases, flat files, and cloud storage. Talend provides a user-friendly drag-and-drop interface for building ETL pipelines, making it easy for non-technical users to create complex data integration workflows.
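The three ETL stages can be sketched with the Python standard library alone. This is a toy pipeline under stated assumptions, not the API of Spark, NiFi, or Talend: the CSV content, the `orders` table, and the cleaning rules (trim names, drop rows with no amount) are invented for illustration.

```python
import csv
import io
import sqlite3

# Hypothetical raw export with inconsistent spacing, casing, and a missing value.
RAW_CSV = """order_id,customer,amount
1, alice ,19.99
2,BOB,
3,carol,5.00
"""


def extract(text):
    """Extract: read the raw CSV into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: drop rows without an amount and normalize names and types."""
    cleaned = []
    for row in rows:
        if not row["amount"].strip():
            continue
        cleaned.append((
            int(row["order_id"]),
            row["customer"].strip().title(),
            float(row["amount"]),
        ))
    return cleaned


def load(rows, conn):
    """Load: write the cleaned rows into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders (order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(transform(extract(RAW_CSV)), conn)
loaded = conn.execute("SELECT * FROM orders ORDER BY order_id").fetchall()
print(loaded)
```

Keeping the stages as separate functions mirrors how the dedicated tools above structure pipelines, and it makes each stage testable on its own.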
Tools used for Extraction, Transformation, and Loading (Contd..)
Apache Airflow: Apache Airflow is an open-source platform used to programmatically author, schedule, and monitor workflows. It provides a flexible and extensible architecture that makes it easy to integrate with various data sources and tools. Airflow's support for dynamic workflows and DAGs (Directed Acyclic Graphs) makes it an ideal choice for complex ETL pipelines.
AWS Glue: AWS Glue is a fully-managed ETL service provided by Amazon Web Services (AWS). It supports various data sources and provides a serverless architecture, making it easy to scale up or down based on demand. Glue provides a visual interface for creating ETL workflows and integrates seamlessly with other AWS services such as S3, Redshift, and EMR (Schatz et al. 2022).
Selecting the right ETL tool is crucial for setting up a data science technology stack. Each tool has its strengths and weaknesses, and the choice ultimately depends on the specific requirements of the project. It is recommended to evaluate each tool's features and functionality before making a decision to ensure that it meets the project's needs.
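The DAG concept mentioned for Airflow can be illustrated without Airflow itself. The sketch below uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or later) to derive a valid run order from task dependencies. It is not Airflow's API, and the task names are hypothetical.

```python
from graphlib import TopologicalSorter

# Each task maps to the set of tasks that must finish before it can start.
dependencies = {
    "extract_orders": set(),
    "extract_customers": set(),
    "transform_join": {"extract_orders", "extract_customers"},
    "load_warehouse": {"transform_join"},
    "refresh_dashboard": {"load_warehouse"},
}

# static_order() yields the tasks so that every dependency comes first.
run_order = list(TopologicalSorter(dependencies).static_order())
print(run_order)
```

Because the graph is acyclic, an order always exists; if two tasks depended on each other, `TopologicalSorter` would raise `graphlib.CycleError`, which is the same class of mistake a workflow scheduler has to reject.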
Technologies that support Business Intelligence
Business Intelligence (BI) is an essential component of a data science technology stack, providing critical insights into an organization's data that enable better business decisions. BI tools and technologies help organizations collect, process, analyze, and visualize data, providing actionable insights into business performance, customer behavior, market trends, and other key indicators. They provide both the infrastructure for analyzing the data and a user-friendly way to present the resulting information.
Tableau: Tableau is a popular data visualization tool that provides a user-friendly interface for creating interactive dashboards, reports, and charts. It supports various data sources, including databases, cloud storage, and spreadsheets.
