Nestlogic ADP

Why anomalies matter

Most analytic technology use fits into three categories:
  • Find problems, and help fix them.
  • Find opportunities, and help exploit them.
  • Monitor to see whether there are any problems or opportunities that need attention.
Data anomalies are central to all three. For example, in internet marketing:
  • Data may arrive from a large number of internal and external systems, each with its own technology stack. The sooner you surface evidence of a malfunction, the sooner you can get it fixed.
  • Internet fraud can be a hugely expensive problem, and is only caught via “tells” that somehow distinguish it from genuine activity.
  • If you notice and analyze a small, localized increase in sales, you might find a way to multiply the effect.
Many enterprises face such challenges and opportunities.

Anomalies in big data

The big data era has introduced new challenges in anomaly identification and management.
  • Old-style anomaly detection typically looks for known data patterns, or for changes in pre-specified data metrics. But the variety, variability and complexity of big data render such techniques obsolete.
  • The alternative to known-pattern matching is to compare data to other data. But naive strategies of this kind have processing burdens that are exponential in data volume.
  • Velocity requirements have also increased, to the point that anomaly detection may soon need to be done at streaming speeds.
The tools used to visualize and analyze big data anomalies must also respect these challenges.

Active data profiling

Nestlogic’s innovative approach to big data anomaly management is called Active Data Profiling, and is embodied in a product line called Nestlogic ADP. The core assumptions behind ADP include:
  • Important anomalies will likely be found in segments of data, rather than individual records.
  • There are many variables or values that can delineate significant data segments, such as time, geography, technical indicators about internet clients, or business indicators about a system’s actual users.
  • Data profiles for different segments should be compared in as many ways as is practical, so as to maximize the chance of uncovering anomalies.
In line with those principles:
  • Nestlogic ADP profiles — i.e. models — big datasets and streams.
  • Anomalies are data segments whose profiles deviate greatly from what the models suggest.
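
As an illustration only, the idea of comparing segment profiles can be sketched in a few lines of Python. This is not Nestlogic's algorithm, which is not described here. It assumes a "profile" is simply the frequency distribution of one field, and it uses total variation distance as a stand-in deviation measure. The field names (`country`, `browser`) and the sample records are hypothetical.

```python
from collections import Counter, defaultdict

def profile(values):
    """Relative frequency of each distinct value (a very simple 'profile')."""
    counts = Counter(values)
    total = sum(counts.values())
    return {value: n / total for value, n in counts.items()}

def total_variation(p, q):
    """Total variation distance: 0 means identical, 1 means no overlap."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)

def rank_segments(records, segment_key, profiled_field):
    """Score each segment by how far its profile is from the overall profile."""
    by_segment = defaultdict(list)
    for record in records:
        by_segment[record[segment_key]].append(record[profiled_field])
    baseline = profile(r[profiled_field] for r in records)
    scores = {seg: total_variation(profile(vals), baseline)
              for seg, vals in by_segment.items()}
    # Most deviant segment first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Hypothetical data: one segment ("XX") looks nothing like the others.
records = (
    [{"country": "US", "browser": "chrome"}] * 3
    + [{"country": "US", "browser": "firefox"}]
    + [{"country": "DE", "browser": "chrome"}] * 3
    + [{"country": "DE", "browser": "firefox"}]
    + [{"country": "XX", "browser": "bot"}] * 4
)
ranked = rank_segments(records, "country", "browser")
```

Here the `XX` segment ranks first because its browser mix shares nothing with the other segments. A real system would profile many fields and segment definitions at once, which is where the performance challenge comes from.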
 

Nestlogic ADP

Active data profiling

There are three main parts to Nestlogic ADP.
  • A stack of standard big data platform packages, including Hadoop, Spark, Kafka and others. (Nestlogic’s team has vast experience operating and using such technologies.)
  • Our algorithms for Active Data Profiling, which yield an unprecedented combination of breadth, precision and performance.
  • Tools to see, analyze and share the anomalies discovered.

Different packages

For different deployment needs

Standalone ADP

EASY TO GET A PROFILE FOR A DOZEN GB OF DATA LOCALLY
Suitable for a small deployment on a single server or even a workstation. With this package you can run a small-scale analysis on up to a dozen GB of data. It’s easy to get started – download it and follow the instructions.

Scalable ADP

GET A PROFILE OF A DOZEN TB OF DATA PER DAY IN YOUR CLOUD-BASED ENVIRONMENT
A scalable version of Nestlogic ADP for your private or public cloud. Use the power of cloud computing to build a real profile of all your data in an environment you control.

ADP as a service

GET A PROFILE OF YOUR DATA IN THE CLOUD AS A SCALABLE SERVICE
We can calculate the profile of your data using Nestlogic ADP as a service, giving you a reliable way to view that profile from any device or location. Don’t change your data pipeline; let us do the job.

Q? WHAT PROBLEMS DO YOU SOLVE?

A. “Big data” commonly contains undetected errors or anomalies. We detect them, and make them easy to analyze, visualize and share. This can:

  • Uncover problems or opportunities you didn’t know you had.
  • Make any analysis that you are already performing more accurate.
  • Give your analysis a starting point.

Q? IS THIS FOR OPERATIONAL USE CASES OR IS IT AN AID TO DATA SCIENCE?

A. Yes.

Our user interfaces are designed to help you quickly identify data anomalies and diagnose their root causes, at operations-monitoring speeds. But the same kinds of results can be invaluable to focus your efforts in data science.

Q? WHAT IS YOUR PRODUCT CALLED?

A. Nestlogic ADP. The ADP stands for Active Data Profiling.

Q? HOW DOES YOUR PRODUCT WORK?

A. Nestlogic ADP calculates, for each segment of data, the degree to which it is anomalous or surprising. Then it provides you with this information via several user interfaces, including:

  • A news feed summarizing the surprises in your data.
  • Visualizations that support powerful drilldown.
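
To make "degree of surprise" concrete, here is a minimal sketch that scores each segment's latest hourly count against its own history and turns the largest surprises into feed items. It is an assumption-laden illustration, not the product's scoring method: the z-score, the threshold of 3.0, the segment labels and the counts are all hypothetical choices.

```python
from statistics import mean, stdev

def surprise(history, latest):
    """z-score of the latest value against its history.

    If the history is completely flat, return 0.0 when the latest value
    matches it and infinity otherwise.
    """
    mu = mean(history)
    sigma = stdev(history)  # needs at least two historical points
    if sigma == 0:
        return 0.0 if latest == mu else float("inf")
    return (latest - mu) / sigma

def news_feed(hourly_counts, threshold=3.0):
    """Return one line per surprising segment, most surprising first."""
    items = []
    for segment, counts in hourly_counts.items():
        *history, latest = counts
        score = surprise(history, latest)
        if abs(score) >= threshold:
            direction = "above" if score > 0 else "below"
            items.append((abs(score),
                          f"{segment}: latest count {latest} is {direction} its usual level"))
    return [text for _, text in sorted(items, reverse=True)]

# Hypothetical hourly record counts per segment; the last value is the newest.
hourly_counts = {
    "country=US": [100, 102, 98, 101, 99, 100],
    "country=XX": [10, 12, 9, 11, 10, 90],
}
feed = news_feed(hourly_counts)
```

With this sample data only `country=XX` appears in the feed, since its latest count is far outside its recent range while `country=US` is steady.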

Q? HOW IS THE NESTLOGIC ADP OFFERING PACKAGED AND DELIVERED?

A. Nestlogic ADP is designed for the cloud. However, we can and do deploy it on customers’ premises if that is what they require.

Q? WHAT KINDS OF ERRORS AND ANOMALIES DO YOU FIND?

A. “Big data” commonly streams in from multiple sources over time. Examples of errors and anomalies we detect include:

  • Data has stopped coming in from particular machines.
  • Particular groups of users are much more common than usual. (This could be an indication of fraudulent traffic.)
  • Data from particular sources has changed in format, causing it to be recorded incorrectly.
  • A particular kind of device has almost disappeared from a data source.
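
The first kind of anomaly in this list, a source that has gone quiet, is the easiest to illustrate. The sketch below assumes you already track the timestamp of the most recent record per source. The machine names and the one-hour gap are hypothetical, and a production check would also need to account for sources that are legitimately idle.

```python
from datetime import datetime, timedelta

def silent_sources(last_seen, now, max_gap=timedelta(hours=1)):
    """Return sources whose newest record is older than max_gap, most overdue first."""
    overdue = {src: now - ts for src, ts in last_seen.items() if now - ts > max_gap}
    return sorted(overdue, key=overdue.get, reverse=True)

# Hypothetical "last record received" timestamps per machine.
now = datetime(2024, 1, 1, 12, 0)
last_seen = {
    "web-01": datetime(2024, 1, 1, 11, 55),
    "web-02": datetime(2024, 1, 1, 9, 30),
    "web-03": datetime(2024, 1, 1, 10, 45),
}
quiet = silent_sources(last_seen, now)
```

In this example `web-02` and `web-03` are reported, in that order, because both have been silent for more than an hour and `web-02` has been silent longest.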

Q? HOW REAL-TIME IS THIS?

A. Very. Data is processed (and compared to historical data) as soon as it arrives.

Nestlogic ADP is field-tested at a data freshness of 1 hour. A true streaming version is running in our labs.

Q? ISN’T IT HARD TO DO ALL THAT WITH REASONABLE PERFORMANCE?

A. Yes.

Q? CAN YOU DO IT WITH REASONABLE PERFORMANCE?

A. Yes.

Q? WHAT UNDERLYING TECHNOLOGIES DO YOU USE?

A. The main ones are Spark, Kafka, Hadoop and Elasticsearch.

Q? HOW CAN WE TRY NESTLOGIC ADP?

A. If you provide us with a sample of your data (real or dummied as you choose), we will be glad to load it into our system and show you the results. Of course, we also have demo data sets of our own.

Q? WANT TO KNOW MORE?

A. Please send an e-mail to info@nestlogic.com or use the contact form.