Thursday, May 24, 2012

Current trends affecting predictive analytics

I was going through an article by Johan Blomme on predictive analytics and found it really interesting. Here is a little summary:

Traditionally, BI systems provided a retrospective view of the business by querying data warehouses containing historical data. Contrary to this, contemporary BI-systems analyze real-time event streams in memory. In today’s rapidly changing business environment, organizational agility not only depends on operational monitoring of how the business is performing but also on the prediction of future outcomes which is critical for a sustainable competitive position.
Predictive analytics leverages actionable intelligence that can be integrated in operational processes.

Current trends affecting predictive analytics:

·         Standards for Data mining and Model Deployment
·         Predictive Analytics in the Cloud
·         Structured and Unstructured Data Types
·         Advanced Database Technology (MPP, Column-Based, In-Memory, etc.)

Standards for data mining and model deployment : CRISP-DM
o    A systematic approach to guide the data mining process has been developed by a consortium of vendors and users of data mining, known as the Cross-Industry Standard Process for Data Mining (CRISP-DM).
o    In the CRISP-DM model, data mining is described as an iterative process that is depicted in several phases (business and data understanding, data preparation, modeling, evaluation and deployment) and their respective tasks. Leading vendors of analytical software offer workbenches that make the CRISP-DM process explicit.

Standards for data mining and model deployment : PMML
o    To deliver a measurable ROI, predictive analytics requires a focus on decision optimization to achieve business objectives. A key element in making predictive analytics pervasive is its integration with line-of-business operations. Without disrupting these operations, business users should be able to take advantage of the guidance of predictive models.
o    For example, in operational environments with frequent customer interactions, high-speed scoring of real-time data is needed to refine recommendations in agent-customer interactions that address specific goals, e.g. improve retention offers. A model deployed for these goals acts as a decision engine by routing the results of predictive analytics to users in the form of recommendations or action messages.
o    A major development for the integration of predictive models in business applications is the PMML-standard (Predictive Model Markup Language) that separates the results of data mining from the tools that are used for knowledge discovery.
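
To make the separation between model and tool concrete, here is a minimal sketch of reading a PMML document and scoring a record with nothing but the Python standard library. The model itself (field names, intercept and coefficients) is hypothetical and invented for illustration, and the reader below handles only a plain linear RegressionModel, not the full PMML standard.

```python
import xml.etree.ElementTree as ET

# Hypothetical PMML 4.1 document: a tiny linear model with made-up coefficients.
PMML_DOC = """<PMML version="4.1" xmlns="http://www.dmg.org/PMML-4_1">
  <Header description="Hypothetical retention-offer model"/>
  <DataDictionary numberOfFields="3">
    <DataField name="tenure_months" optype="continuous" dataType="double"/>
    <DataField name="complaints" optype="continuous" dataType="double"/>
    <DataField name="churn_score" optype="continuous" dataType="double"/>
  </DataDictionary>
  <RegressionModel modelName="churn" functionName="regression">
    <MiningSchema>
      <MiningField name="tenure_months"/>
      <MiningField name="complaints"/>
      <MiningField name="churn_score" usageType="predicted"/>
    </MiningSchema>
    <RegressionTable intercept="0.5">
      <NumericPredictor name="tenure_months" coefficient="-0.01"/>
      <NumericPredictor name="complaints" coefficient="0.1"/>
    </RegressionTable>
  </RegressionModel>
</PMML>"""

NS = {"p": "http://www.dmg.org/PMML-4_1"}


def load_linear_model(pmml_text):
    """Extract the intercept and coefficients from a linear RegressionModel."""
    root = ET.fromstring(pmml_text)
    table = root.find("p:RegressionModel/p:RegressionTable", NS)
    intercept = float(table.get("intercept"))
    coefs = {
        node.get("name"): float(node.get("coefficient"))
        for node in table.findall("p:NumericPredictor", NS)
    }
    return intercept, coefs


def score(model, record):
    """Apply the linear model to one record given as a dict of field values."""
    intercept, coefs = model
    return intercept + sum(c * record[name] for name, c in coefs.items())


model = load_linear_model(PMML_DOC)
example_score = score(model, {"tenure_months": 20.0, "complaints": 2.0})
print(example_score)
```

The point of the sketch is that the scoring side never needs to know which tool trained the model; it only needs the PMML file. A production scoring engine would of course cover many more model types and validate the document.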

Structured and unstructured data types:
o    The field of advanced analytics is moving towards providing a number of solutions for the handling of big data. Characteristic of the new marketing data is its text-formatted content in unstructured data sources, which covers "the consumer's sphere of influence": analytics must be able to capture and analyze consumer-initiated communication.
o    By analyzing growing streams of social media content and sifting through sentiment and behavioral data that emanates from online communities, it is possible to acquire powerful insights into consumer attitudes and behavior. Social media content gives an instant view of what is taking place in the ecosystem of the organization. Enterprises can leverage insights from social media content to adapt marketing, sales and product strategies in an agile way.
o    The convergence between social media feeds and analytics also goes beyond the aggregate level. Social network analytics enhance the value of predictive modeling tools and business processes will benefit from new inputs that are deployed. For example, the accuracy and effectiveness of predictive churn analytics can be increased by adding social network information that identifies influential users and the effects of their actions on other group members.
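
As a rough illustration of the churn example, the sketch below derives a few social network features that could be fed into a churn model as extra inputs. The contact graph, the customer names and the set of churned customers are all hypothetical, and counting direct contacts (degree) is only one simple stand-in for "influence"; the article does not prescribe a particular measure.

```python
from collections import defaultdict

# Hypothetical contact graph: each pair means two customers are in touch.
edges = [("ann", "bob"), ("ann", "cat"), ("ann", "dan"), ("bob", "cat"), ("dan", "eve")]
# Hypothetical set of customers who have already churned.
churned = {"ann"}


def build_neighbors(edges):
    """Turn an undirected edge list into a customer -> set of contacts map."""
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    return neighbors


def network_features(customer, neighbors, churned):
    """Compute simple network inputs for one customer."""
    contacts = neighbors[customer]
    degree = len(contacts)
    churned_contacts = len(contacts & churned)
    return {
        "degree": degree,
        "churned_contacts": churned_contacts,
        "churned_share": churned_contacts / degree if degree else 0.0,
    }


neighbors = build_neighbors(edges)
features = {c: network_features(c, neighbors, churned) for c in sorted(neighbors)}
# The customer with the most direct contacts, used here as a crude influence proxy.
most_connected = max(neighbors, key=lambda c: len(neighbors[c]))

for customer, row in features.items():
    print(customer, row)
```

Features such as churned_share can then sit next to the usual behavioral variables, so that a customer whose well-connected contact has just left is flagged earlier.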

Advances in database technology : big data and predictive analytics
o    As companies gather larger volumes of data, the need for the execution of predictive models becomes more prevalent.
o    A known practice is to build and test predictive models in a development environment that consists of operational data and warehousing data. In many cases analysts work with a subset of data through sampling. Once developed, a model is copied to a runtime environment where it can be deployed with PMML. A user of an operational application can invoke a stored predictive model by including user-defined functions in SQL statements. This causes the RDBMS to mine the data itself without transferring the data into a separate file. The criteria expressed in a predictive model can be used to score, segment, rank or classify records.
o    An emerging practice to work with all data and directly deploy predictive models is in-database analytics. For example, Zementis (www.zementis.com) and Greenplum (www.greenplum.com) have joined forces to score huge amounts of data in parallel. The Universal PMML Plug-in developed by Zementis is an in-database scoring engine that fully supports the PMML standard to execute predictive models from commercial and open source data mining tools within the database.
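
The idea of invoking a model through a user-defined function in a SQL statement can be sketched with Python's built-in sqlite3 module. This is only a small stand-in: SQLite is not an MPP database, the churn_score function, the customers table and the coefficients are hypothetical, and this is not how the Zementis plug-in is implemented.

```python
import sqlite3

# Hypothetical linear model parameters, as they might be read from a stored model.
INTERCEPT = 0.5
COEF_TENURE = -0.01
COEF_COMPLAINTS = 0.1


def churn_score(tenure_months, complaints):
    """Score one row; the database calls this once per record."""
    return INTERCEPT + COEF_TENURE * tenure_months + COEF_COMPLAINTS * complaints


conn = sqlite3.connect(":memory:")
# Register the Python function so SQL statements can call churn_score(...).
conn.create_function("churn_score", 2, churn_score)

conn.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, tenure_months REAL, complaints REAL)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, 20.0, 2.0), (2, 40.0, 0.0), (3, 5.0, 4.0)],
)

# Score and rank the records inside the query, without exporting the data first.
rows = conn.execute(
    "SELECT id, churn_score(tenure_months, complaints) AS score "
    "FROM customers ORDER BY score DESC"
).fetchall()

for customer_id, value in rows:
    print(customer_id, round(value, 2))
```

The data never leaves the database; only the scored and ranked result comes back to the application, which is the property the in-database approach is after.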

Predictive analytics in the cloud
o    While vendors implement predictive analytics capabilities into their databases, a similar development is taking place in the cloud. This has an impact on how the cloud can assist businesses to manage business processes more efficiently and effectively. Of particular importance is how cloud computing and SaaS provide an infrastructure for the rapid development of predictive models in combination with open standards. The PMML standard has already received considerable adoption and, combined with a service-oriented architecture for the design of loosely coupled systems, the cloud computing/SaaS model offers a cost-effective way to implement predictive models.
o    As an illustration of how predictive models can be hosted in the cloud, we refer to the ADAPA scoring engine (Adaptive Decision and Predictive Analytics, www.zementis.com). ADAPA is an on-demand predictive analytics solution that combines open standards and deployment capabilities. The data infrastructure to launch ADAPA in the cloud is provided by Amazon Web Services (www.amazonwebservices.com). Models developed with PMML-compliant software tools (e.g. SAS, KNIME, R, etc.) can be easily uploaded to the ADAPA environment.
o    The on-demand paradigm allows businesses to use sophisticated software applications over the Internet, resulting in a faster time to production with a reduction of total cost of ownership.
o    Moving predictive analytics into the cloud also accelerates the trend towards self-service BI. The so-called democratization of data implies that data access and analytics should be available across the enterprise. Both the increase in data volumes and the growing need for insights from data reinforce the trend towards self-guided analysis. The focus on self-guided analysis also stems from the often long development backlogs that users experience in the enterprise context. In contrast, cloud computing and SaaS enable organizations to make use of solutions that are tailored to specific business problems and complement existing systems.
