What Is Data Analytics?

Data is being created faster than ever—but are you getting the most from the data you’re collecting?

Understanding data at a deep level is critical to building a successful organization. Data analytics is the process by which raw data becomes usable knowledge that can be acted on. Intel® technology works at every stage of the data pipeline to make it easier for organizations to collect and analyze data for practically any purpose.

For businesses and organizations of all kinds, transforming data into actionable intelligence can mean the difference between struggling and thriving. That transformation depends on data analytics.

While almost every organization analyzes some data, modern analytics enables an unprecedented level of understanding and insight. How far has your company gone toward a data-led, analytics-driven culture—and what’s the next step?

It all starts with the data pipeline.

Understanding the Data Pipeline

Establishing a well-developed data analytics approach is an evolutionary process requiring time and commitment. For organizations that want to take the next step, it’s critical to understand the data pipeline and the life cycle of data going through that pipeline.

  • Ingest: Data Collection
    The first stage of the data pipeline is ingestion. During this stage, data is collected from sources and moved into a system where it can be stored. Data may be collected as a continuous stream or as a series of discrete events.

    For most unstructured data—IDC estimates 80 to 90 percent1—ingestion is both the beginning and the end of the data life cycle. This information, called “dark data,” is ingested but never analyzed or used to impact the rest of the organization.

    Today, one of the biggest advanced data analytics trends starts right at the ingestion stage. In these cases, real-time analytics of streaming data happens alongside the ingestion process. This is known as edge analytics, and it requires high compute performance with low power consumption. Edge analytics often involves IoT sensors gathering information from connected devices such as factory machines, city streetlights, and agricultural equipment.

  • Prepare: Data Processing
    The next stage of the data pipeline prepares the data for use and stores information in a system accessible by users and applications. To maximize its quality, data must be cleaned and transformed into information that can be easily accessed and queried.

    Typically, information is prepared and stored in a database. Different types of databases are used to understand and analyze data in different formats and for different purposes. SQL* relational database management systems, like SAP HANA* or Oracle DB*, typically handle structured data sets. This may include financial information, credential verification, or order tracking. Unstructured data workloads and real-time analytics are more likely to use NoSQL* databases like Cassandra and HBase.

    Optimizing this stage of the data pipeline requires compute and memory performance, as well as data management, for faster queries. It also calls for scalability to accommodate high volumes of data. Data can be stored and tiered according to urgency and usefulness, so that the most-critical data can be accessed with the highest speed.

    Intel® technologies power some of today’s most storage- and memory-intensive database use cases. With Intel® Optane™ Solid State Drives, Alibaba Cloud* was able to provide 100 TB of storage capacity for each POLARDB instance.

  • Analyze: Data Modeling
    In the next stage of the data pipeline, stored data is analyzed and models are built. Data may be analyzed by an end-to-end analytics platform like SAP, Oracle, or SAS—or processed at scale by tools like Apache Spark*.

    Accelerating this phase of the data pipeline while reducing its cost is critical to competitive advantage. Libraries and toolkits can cut development time and cost. Meanwhile, hardware and software optimizations can help keep server and data center costs down while improving response time.

    Technologies like in-memory analytics can enhance data analytics capabilities and make analytics investments more cost-effective. With Intel, chemical company Evonik achieved 17x faster restarts for SAP HANA* data tables.2

  • Act: Decision-Making
    After data has been ingested, prepared, and analyzed, it’s ready to be acted upon. Data visualization and reporting help communicate the results of analytics.

    Traditionally, interpretation by data scientists or analysts has been required to transform these results into business intelligence that can be more broadly acted on. However, businesses have begun using AI to automate actions—like sending a maintenance crew or changing a room’s temperature—based on analytics.

For a more in-depth resource about the data pipeline and how organizations can evolve their analytics capabilities, read our e-book From Data to Insights: Maximizing Your Data Pipeline.
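
The four pipeline stages described above can be sketched as one minimal program. Everything in this example is hypothetical and simplified: the sensor readings, the cleaning rules, the threshold "model," and the resulting action are illustrative stand-ins for real ingestion, preparation, analysis, and decision systems.

```python
# A toy four-stage data pipeline: ingest -> prepare -> analyze -> act.
# All data, thresholds, and actions here are hypothetical, for illustration only.
from statistics import mean

def ingest():
    # Stage 1 (Ingest): collect raw readings from a source (a simulated sensor stream).
    return [72.5, 71.8, None, 250.0, 73.1, 72.9]  # includes a gap and an outlier

def prepare(raw, low=0.0, high=150.0):
    # Stage 2 (Prepare): clean the data -- drop missing values and out-of-range readings.
    return [r for r in raw if r is not None and low <= r <= high]

def analyze(clean, threshold=73.0):
    # Stage 3 (Analyze): model the data -- here, a simple average checked against a threshold.
    avg = mean(clean)
    return {"average": avg, "over_threshold": avg > threshold}

def act(result):
    # Stage 4 (Act): turn the analysis result into an action (e.g., dispatch maintenance).
    return "dispatch crew" if result["over_threshold"] else "no action"

raw = ingest()
result = analyze(prepare(raw))
print(act(result))  # -> "no action" for this sample data
```

Real pipelines replace each function with infrastructure—message queues for ingestion, databases for preparation, analytics platforms for modeling—but the flow of data through the four stages is the same.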


The Four Types of Data Analytics

Data analytics can be divided into four basic types: descriptive analytics, diagnostic analytics, predictive analytics, and prescriptive analytics. These are steps toward analytics maturity, with each step shortening the distance between the “analyze” and “act” phases of the data pipeline.

  • Descriptive Analytics
    Descriptive analytics is used to summarize and visualize historical data. In other words, it tells organizations what has already happened.
    The simplest type of analysis, descriptive analytics can be as basic as a chart analyzing last year’s sales figures. Every analytics effort depends on a firm foundation of descriptive analytics. Many businesses still rely primarily on this form of analytics, which includes dashboards, data visualizations, and reporting tools.

  • Diagnostic Analytics
    As analytics efforts mature, organizations start asking tougher questions of their historical data. Diagnostic analytics examines not just what happened, but why it happened. To perform diagnostic analytics, analysts need to be able to make detailed queries to identify trends and causation.
    Using diagnostic analytics, new relationships between variables may be discovered: for a sports apparel company, rising sales figures in the Midwest may correlate with sunny weather. Diagnostic analytics matches data to patterns and works to explain anomalous or outlier data.

  • Predictive Analytics
    While the first two types of analytics examine historical data, both predictive analytics and prescriptive analytics look to the future. Predictive analytics creates a forecast of likely outcomes based on identified trends and statistical models derived from historical data.
    Building a predictive analytics strategy requires model building and validation to create optimized simulations, so that business decision-makers can achieve the best outcomes. Machine learning is commonly employed for predictive analytics, training models on highly scaled data sets to generate more intelligent predictions.

  • Prescriptive Analytics
    Another advanced type of analytics is prescriptive analytics. With prescriptive analytics, which recommends the best solution based on predictive analytics, the evolution toward true data-driven decision-making is complete.
    Prescriptive analytics relies heavily on machine learning analytics and neural networks. These workloads demand high-performance compute and memory. This type of analytics requires a firm foundation based on the other three types and can be executed only by companies with a highly evolved analytics strategy that are willing to commit significant resources to the effort.
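
Taken together, the four types can be illustrated on a single toy data set. Everything here is hypothetical: the sales figures, the sunny-day variable (echoing the sports apparel example above), the simple linear trend model, and the inventory recommendation are illustrative stand-ins for real analytics workloads.

```python
# Toy monthly sales data (hypothetical) illustrating the four types of analytics.
from statistics import mean

sales      = [100, 110, 125, 135, 150, 160]   # units sold per month
sunny_days = [8, 10, 14, 16, 20, 22]          # a second variable, used for diagnosis

# Descriptive: what happened? Summarize the historical data.
descriptive = {"total": sum(sales), "average": mean(sales)}

# Diagnostic: why did it happen? Measure how sales co-vary with sunny days.
sx, sy = mean(sunny_days), mean(sales)
cov = sum((x - sx) * (y - sy) for x, y in zip(sunny_days, sales))
var = sum((x - sx) ** 2 for x in sunny_days)
slope = cov / var  # extra units sold per additional sunny day

# Predictive: what will happen? Extrapolate the linear trend to next month.
t = list(range(len(sales)))
tm = mean(t)
trend = sum((i - tm) * (y - sy) for i, y in zip(t, sales)) / sum((i - tm) ** 2 for i in t)
forecast = sy + trend * (len(sales) - tm)  # predicted sales for month 7

# Prescriptive: what should we do? Recommend an action based on the forecast.
action = "increase inventory" if forecast > sales[-1] else "hold inventory"

print(descriptive, round(slope, 2), round(forecast, 1), action)
```

Each step builds on the one before it, mirroring the maturity path described above: the diagnostic and predictive computations both reuse the descriptive summary, and the prescription is only as sound as the forecast behind it.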

Data Analytics Use Cases

Intel® technology is changing the way modern enterprise organizations do analytics. With use cases that span many industries—and the globe—Intel works to continuously drive analytics forward while helping businesses optimize for performance and cost-effectiveness.

  • Manufacturing
    For automakers, quality control saves money—and lives. At Audi’s automated factory, analysts used sampling to ensure weld quality. Using predictive analytics at the edge, built on Intel’s Industrial Edge Insights Software, the manufacturer can automatically check every weld, on every car, and predict weld problems based on sensor readings when the weld was made.

  • Healthcare
    Training AI to read chest X-rays can help patients and providers get a diagnosis faster. Using Intel® Xeon® Scalable processors to power a neural network, research organization SURF reduced training time from one month to six hours while improving accuracy.

  • Telecommunications
    Smartphones and mobile internet have created unprecedented amounts of mobile data. To enhance customer experiences, telecommunications company Bharti Airtel deployed advanced network analytics using Intel® Xeon® processors and Intel® SSDs to detect and correct network problems faster.

Intel® Technologies for Analytics

With a broad ecosystem of technologies and partners to help businesses create the solutions of tomorrow, Intel powers advanced analytics for enterprises worldwide. From the data center to the edge, Intel works at every point in the analytics ecosystem to deliver maximum value and performance.

  • Intel® Xeon® Scalable processors make it possible to analyze massive amounts of data at fast speeds, whether at the edge, in the data center, or in the cloud.
  • Intel® Optane™ technology represents a revolutionary approach to memory and storage that helps overcome bottlenecks in how data is moved and stored.
  • Intel® FPGAs provide acceleration within the data center to improve response times.
  • Intel® Select Solutions are verified for optimal performance, eliminating guesswork and accelerating solution deployment.

Frequently Asked Questions

What is data analytics?
Data analytics is the process by which information moves from raw data to insights that can be acted on by the business.

What is big data analytics?
Big data analytics uses highly scaled sets of data to uncover new relationships and better understand large amounts of information.

What is advanced analytics?
Advanced analytics is not a specific technology or set of technologies. It’s a classification for use cases and solutions that make use of advanced technologies like machine learning, augmented analytics, and neural networks.

What is data analytics used for?
Data analytics is used to produce business intelligence that can help organizations make sense of past events, predict future events, and plan courses of action.

What are the stages of the data pipeline?
The four stages of the data pipeline are ingest, prepare, analyze, and act.

What is the difference between descriptive and diagnostic analytics?
Descriptive and diagnostic analytics both look to the past. Descriptive analytics answers the question of what happened, while diagnostic analytics examines why it happened.

What is the difference between descriptive and prescriptive analytics?
Descriptive analytics looks into the past to say what has already happened and is the foundation of all other types of analytics. Prescriptive analytics makes recommendations for action based on existing data and predictive algorithms.

What is the difference between predictive and prescriptive analytics?
Predictive and prescriptive analytics both generate insights about the future. Predictive analytics creates a forecast about predicted events, and prescriptive analytics recommends a course of action based on those predictions.

What is predictive analytics used for?
Predictive analytics is used to better anticipate future events. It can identify maintenance needs before they develop or assess the likely impact of economic conditions on future sales.

Related Content

Learn more about Intel® technologies for analytics.

Data Analytics

Find out how analytics can help organizations deliver reliable, actionable insight and how to evolve your analytics strategy.

Advanced Data Analytics

Smarter businesses start with advanced analytics. Learn how to get ahead in a data-driven marketplace with Intel® technologies.

Machine Learning Analytics

Get deeper insights at faster speeds using machine learning and artificial intelligence to boost analytics efforts.

Predictive Analytics

Harness your data to gain a competitive advantage by making actionable predictions about the future.

Notices & Disclaimers
Intel® technologies may require enabled hardware, software or service activation. // No product or component can be absolutely secure. // Your costs and results may vary. // Intel does not control or audit third-party data. You should consult other sources to evaluate accuracy.

Product and Performance Information

1“What Your Data Isn’t Telling You: Dark Data Presents Problems And Opportunities For Big Businesses,” Forbes, June 2019, forbes.com/sites/marymeehan/2019/06/04/what-your-data-isnt-telling-you-dark-data-presents-problems-and-opportunities-for-big-businesses/#3086fe21484e.
2SAP HANA* simulated workload for SAP BW edition for SAP HANA* Standard Application Benchmark Version 2 as of 30 May 2018. Software and workloads used in performance tests may have been optimized for performance only on Intel® microprocessors. Performance tests, such as SYSmark* and MobileMark, are measured using specific computer systems, components, software, operations, and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more complete information, visit www.intel.co.uk/benchmarks. Performance results are based on testing as of dates shown in configurations and may not reflect all publicly available updates. See backup for configuration details. No product or component can be absolutely secure. Baseline configuration with traditional DRAM: Lenovo ThinkSystem SR950 server with 8x Intel® Xeon® Platinum 8176M processors (28 cores, 165W, 2.1 GHz). Total memory consists of 48x 16 GB TruDDR4 2,666 MHz RDIMMs and 5x ThinkSystem 2.5-inch PM1633a 3.84 TB capacity SAS 12 GB hot-swap solid-state drives (SSDs) for SAP HANA* storage. The operating system is SUSE Linux Enterprise Server 12* SP3 and uses SAP HANA* 2.0 SPS 03 with a 6 TB data set. Average start time for all data finished after table preload for 10 iterations: 50 minutes. New configuration with a combination of DRAM and Intel® Optane™ DC persistent memory: Intel Lightning Ridge SDP with 4x CXL QQ89 AO processor (24 cores, 165W, 2.20 GHz). Total memory consists of 24x 32 GB DDR4 2666 MHz and 24x 128 GB AEP ES2, and 1x Intel® SSD DC S3710 Series 800 GB, 3x Intel® SSD DC P4600 Series 2.0 TB, 3x Intel® SSD DC Series S4600 1.9 TB capacity. BIOS version WW33’18. 
The operating system is SUSE Linux* Enterprise Server 15 and uses SAP HANA* 2.0 SPS 03 (a specific PTF Kernel from SUSE was applied) with a 1.3 TB data set. Average start time for optimized tables after preload: 17x improvement.