Why anomalies matter
Most analytic technology use fits into three categories:
- Find problems, and help fix them.
- Find opportunities, and help exploit them.
- Monitor to see whether there are any problems or opportunities that need attention.
Anomalies matter in each of these. For example:
- Data may arrive from a large number of internal and external systems, each with its own technology stack. The sooner you surface evidence of a malfunction, the sooner you can get it fixed.
- Internet fraud can be a hugely expensive problem, and is only caught via “tells” that somehow distinguish it from genuine activity.
- If you notice and analyze a small, localized increase in sales, you might find a way to multiply the effect.
Anomalies in big data
The big data era has introduced new challenges in anomaly identification and management.
- Old-style anomaly detection typically looks for known data patterns, or for changes in pre-specified data metrics. But the variety, variability and complexity of big data render such techniques obsolete.
- The alternative to known-pattern matching is to compare data to other data. But naive strategies of this kind have processing burdens that are exponential in data volume.
- Velocity requirements have also increased, to the point that anomaly detection may soon need to be done at streaming speeds.
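To make the streaming point concrete, here is a minimal sketch of single-pass anomaly flagging. It uses Welford's online algorithm to keep a running mean and variance in constant memory, so each value is examined once as it arrives. This is a generic textbook technique, not a description of how Nestlogic ADP works; the names `RunningStats` and `flag_stream`, the threshold and the warm-up length are all assumptions chosen for illustration.

```python
import math


class RunningStats:
    """Welford's online algorithm: running mean and variance in O(1) memory."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # sum of squared deviations from the running mean

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def stddev(self):
        # Sample standard deviation; 0.0 until at least two values are seen.
        return math.sqrt(self.m2 / (self.n - 1)) if self.n > 1 else 0.0


def flag_stream(values, threshold=4.0, warmup=10):
    """Yield (index, value) for values far from the running mean.

    `threshold` (in standard deviations) and `warmup` are illustrative
    defaults, not recommended settings.
    """
    stats = RunningStats()
    for i, x in enumerate(values):
        sd = stats.stddev()
        if stats.n >= warmup and sd > 0 and abs(x - stats.mean) > threshold * sd:
            yield i, x
        stats.update(x)  # every value, flagged or not, updates the statistics


readings = [10, 11, 9, 10, 12, 10, 9, 11, 10, 10, 11, 50, 10]
flagged = list(flag_stream(readings))
```

Note that this sketch still relies on one pre-specified metric, which is exactly the limitation described above; it only illustrates the constant-memory, one-pass style of processing that streaming speeds require.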
Active data profiling
- Important anomalies will likely be found in segments of data, rather than individual records.
- There are many variables or values that can delineate significant data segments, such as time, geography, technical indicators about internet clients, or business indicators about a system’s actual users.
- Data profiles for different segments should be compared in as many ways as is practical, so as to maximize the chance of uncovering anomalies.
- Nestlogic ADP profiles — i.e. models — big datasets and streams.
- Anomalies are data segments whose profiles deviate greatly from what the models suggest.
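The idea of comparing segment profiles can be sketched in a few lines. The example below is an illustration under simplifying assumptions, not Nestlogic's algorithm: a "profile" is just the relative frequency of one categorical field, the model is the profile of the whole dataset, and deviation is measured with Jensen-Shannon divergence. The function names and the `region` and `browser` fields are hypothetical.

```python
import math
from collections import Counter, defaultdict


def profile(values):
    """Relative frequency of each value: a deliberately simple 'profile'."""
    counts = Counter(values)
    total = sum(counts.values())
    return {k: c / total for k, c in counts.items()}


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two discrete distributions.

    Returns 0 for identical distributions and 1 for disjoint ones.
    """
    keys = set(p) | set(q)
    m = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in keys}

    def kl(a):
        return sum(a[k] * math.log2(a[k] / m[k]) for k in a if a[k] > 0)

    return 0.5 * kl(p) + 0.5 * kl(q)


def rank_segments(records, segment_key, field):
    """Rank segments by how far their profile of `field` is from the overall one."""
    by_segment = defaultdict(list)
    for r in records:
        by_segment[r[segment_key]].append(r[field])
    baseline = profile([r[field] for r in records])
    scores = {
        seg: js_divergence(profile(vals), baseline)
        for seg, vals in by_segment.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Made-up sample data: two ordinary regions and one with unusual clients.
records = (
    [{"region": "EU", "browser": "chrome"}] * 50
    + [{"region": "EU", "browser": "firefox"}] * 50
    + [{"region": "US", "browser": "chrome"}] * 50
    + [{"region": "US", "browser": "firefox"}] * 50
    + [{"region": "APAC", "browser": "headless"}] * 18
    + [{"region": "APAC", "browser": "chrome"}] * 2
)
ranked = rank_segments(records, "region", "browser")
```

In this toy data the `APAC` segment ranks first because its client mix differs most from the overall mix. A real system would compare many fields and many segment definitions at once, which is where the processing cost comes from.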
Active data profiling
- A stack of standard big data platform packages, including Hadoop, Spark, Kafka and others. (Nestlogic’s team has vast experience operating and using such technologies.)
- Our algorithms for Active Data Profiling, which yield an unprecedented combination of breadth, precision and performance.
- Tools to see, analyze and share the anomalies discovered.
For different deployment needs
EASILY GET A PROFILE OF A DOZEN GB OF DATA LOCALLY
Suitable for a small deployment on a single server or even a workstation. With this package you can run a small-scale analysis on up to a dozen GB of data. It’s easy to get started: download and follow the instructions.
GET A PROFILE OF A DOZEN TB OF DATA PER DAY IN YOUR CLOUD-BASED ENVIRONMENT
A scalable version of Nestlogic ADP for your private or public cloud. Use the power of cloud computing to discover a real profile of all your data in an environment you control.
ADP as a service
GET A PROFILE OF YOUR DATA IN THE CLOUD AS A SCALABLE SERVICE
We can calculate the profile of your data using Nestlogic ADP as a service, giving you a reliable way to view that profile from any device or location. Don’t change your data pipeline; let us do the job.
Q? WHAT PROBLEMS DO YOU SOLVE?
A. “Big data” commonly contains undetected errors or anomalies. We detect them, and make them easy to analyze, visualize and share. This can:
- Uncover problems or opportunities you didn’t know you had.
- Make any analysis that you are already performing more accurate.
- Give your analysis a starting point.
Q? IS THIS FOR OPERATIONAL USE CASES OR IS IT AN AID TO DATA SCIENCE?
A. Yes. Our user interfaces are designed to help you quickly identify data anomalies and diagnose their root causes, at operations-monitoring speeds. But the same kinds of results can be invaluable to focus your efforts in data science.
Q? WHAT IS YOUR PRODUCT CALLED?
A. Nestlogic ADP. The ADP stands for Active Data Profiling.
Q? HOW DOES YOUR PRODUCT WORK?
A. Nestlogic ADP calculates, for each segment of data, the degree to which it is anomalous or surprising. Then it provides you with this information via several user interfaces, including:
- A news feed summarizing the surprises in your data.
- Visualizations that support powerful drilldown.
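As a rough illustration of scoring "surprise" per segment and turning the results into a feed, the sketch below computes a robust z-score (median and median absolute deviation) of each segment's current count against its own history, then lists the most surprising segments first. The scoring method, the threshold, the function names and the segment labels are all assumptions made for this example; the actual scoring used by Nestlogic ADP is not described here.

```python
import statistics


def surprise(history, today):
    """Robust z-score of today's value against the segment's own history."""
    med = statistics.median(history)
    mad = statistics.median([abs(x - med) for x in history])
    scale = 1.4826 * mad
    if scale == 0:
        scale = 1.0  # constant history: fall back to the raw difference
    return (today - med) / scale


def news_feed(histories, current, threshold=3.0):
    """Return one line per surprising segment, most surprising first."""
    items = []
    for segment, history in histories.items():
        value = current.get(segment, 0)  # a missing segment counts as zero
        score = surprise(history, value)
        if abs(score) >= threshold:
            direction = "above" if score > 0 else "below"
            items.append(
                (abs(score),
                 f"{segment}: {value} is far {direction} its usual level "
                 f"(score {score:+.1f})")
            )
    return [text for _, text in sorted(items, reverse=True)]


# Made-up daily event counts per segment.
histories = {
    "device=android": [100, 102, 98, 101, 99],
    "device=ios": [50, 52, 49, 51, 50],
}
current = {"device=android": 100, "device=ios": 5}
feed = news_feed(histories, current)
```

Here only the `device=ios` segment appears in the feed, since its count has dropped far below its history while `device=android` is at its usual level.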
Q? HOW IS THE NESTLOGIC ADP OFFERING PACKAGED AND DELIVERED?
A. Nestlogic ADP is designed for the cloud. However, we can and do deploy it on customer premises if that is what they require.
Q? WHAT KINDS OF ERRORS AND ANOMALIES DO YOU FIND?
A. “Big data” commonly streams in from multiple sources over time. Examples of errors and anomalies we detect include:
- Data has stopped coming in from particular machines.
- Particular groups of users are much more common than usual. (This could be an indication of fraudulent traffic.)
- Data from particular sources has changed in format, causing it to be recorded incorrectly.
- A particular kind of device has almost disappeared from a data source.
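The first case in the list above, a source that has gone quiet, is the simplest to sketch. The example below tracks the most recent timestamp seen per source and reports any source silent for longer than an allowed gap. It is a minimal illustration, not the product's method; the function name `silent_sources`, the source names and the fixed `max_gap` are hypothetical.

```python
from datetime import datetime, timedelta


def silent_sources(events, now, max_gap):
    """Return sources whose latest event is older than `max_gap`.

    `events` is an iterable of (source, timestamp) pairs in any order.
    """
    last_seen = {}
    for source, ts in events:
        if source not in last_seen or ts > last_seen[source]:
            last_seen[source] = ts
    return sorted(s for s, ts in last_seen.items() if now - ts > max_gap)


# Made-up events: sensor-b stops reporting after 09:05.
events = [
    ("sensor-a", datetime(2024, 1, 1, 9, 0)),
    ("sensor-b", datetime(2024, 1, 1, 9, 5)),
    ("sensor-a", datetime(2024, 1, 1, 11, 50)),
]
now = datetime(2024, 1, 1, 12, 0)
quiet = silent_sources(events, now, max_gap=timedelta(hours=1))
```

A fixed gap is a crude choice: sources that report at very different rates would each need their own expected interval, which is one reason to learn a profile per source rather than hard-code a threshold.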