“RapidMiner is the world's number one open source solution for process-based data analysis. Broadening its already wide range of functions by means of extensions has been part of our overall strategy from the beginning”, explains Dr. Ingo Mierswa, CEO of Rapid-I.
PMML Extension
An important component in integrating data mining results with other applications is the Predictive Model Markup Language (PMML)1, which is used to exchange data mining models between different applications and platforms. Models produced with RapidMiner can therefore be transferred directly to a database that also supports the PMML standard and put to use there with little effort. Apart from the exchangeability of data mining models, the ability to apply models to large data sets directly in the database is an especially important topic for Rapid-I. Until now, RapidMiner users have had to rely on combinations of special access operators, available only for selected modelling methods such as Naive Bayes, to ensure scalability for large volumes of data.
“With the new PMML Extension for RapidMiner and RapidAnalytics, there are more ways to use RapidMiner models for scoring large data sets, which makes the deployment of data mining a whole lot easier”, says Mierswa.
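To illustrate what exchanging a model via PMML can look like in practice, here is a minimal sketch in Python. It is not output from RapidMiner or the PMML Extension: the model, the field names (age, income, score) and the coefficients are hypothetical and written by hand, and the helper functions load_linear_model and score are assumptions made for this example. The sketch reads a small PMML regression model using only the standard library and applies it to one record, which is in essence what a PMML-capable database does when it scores data.

```python
import xml.etree.ElementTree as ET

# Hand-written, hypothetical PMML document describing a linear regression model.
# Field names and coefficients are illustrative only.
PMML_DOC = """<PMML xmlns="http://www.dmg.org/PMML-4_4" version="4.4">
  <Header description="Hypothetical linear model for illustration"/>
  <DataDictionary numberOfFields="3">
    <DataField name="age" optype="continuous" dataType="double"/>
    <DataField name="income" optype="continuous" dataType="double"/>
    <DataField name="score" optype="continuous" dataType="double"/>
  </DataDictionary>
  <RegressionModel modelName="demo" functionName="regression">
    <MiningSchema>
      <MiningField name="age"/>
      <MiningField name="income"/>
      <MiningField name="score" usageType="target"/>
    </MiningSchema>
    <RegressionTable intercept="1.5">
      <NumericPredictor name="age" coefficient="0.25"/>
      <NumericPredictor name="income" coefficient="2.0"/>
    </RegressionTable>
  </RegressionModel>
</PMML>
"""

# PMML elements live in a versioned XML namespace.
NS = {"pmml": "http://www.dmg.org/PMML-4_4"}


def load_linear_model(pmml_text):
    """Extract the intercept and per-field coefficients from the PMML text."""
    root = ET.fromstring(pmml_text)
    table = root.find(".//pmml:RegressionTable", NS)
    intercept = float(table.get("intercept"))
    coefficients = {
        predictor.get("name"): float(predictor.get("coefficient"))
        for predictor in table.findall("pmml:NumericPredictor", NS)
    }
    return intercept, coefficients


def score(record, intercept, coefficients):
    """Apply the linear model to one record (a dict of field name -> value)."""
    return intercept + sum(
        coefficient * record[name] for name, coefficient in coefficients.items()
    )


if __name__ == "__main__":
    intercept, coefficients = load_linear_model(PMML_DOC)
    print(score({"age": 40.0, "income": 3.0}, intercept, coefficients))
```

Because the model is described in a vendor-neutral XML file, any system that understands the same PMML elements can apply it without access to the tool that trained it.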
Web Extension
The new web extension for RapidMiner and RapidAnalytics provides stronger integration with information from the Internet. This extension is a complete revision of the access methods that were part of the text extension in older versions. The web extension makes it possible to access any information from the web and to combine it freely. Going beyond the crawler, which collects content from the Internet for data analysis, the extension supports specific functions for working with web texts as well as for connecting to other Internet sources such as RSS feeds. This means that enterprises are no longer limited to the structured information in their databases, but can draw on any data from the Internet for evaluation and thus substantially increase their knowledge.
Community Extension
With the new community extension, RapidMiner is bringing the basic idea of Web 2.0 into the data mining world. The extension offers a direct connection to the portal myExperiment.org, where users can discuss data mining processes and communicate with data analysts who are working on similar analysis problems. The portal already has an active community and all the characteristics of a social network. With the new extension, users can upload processes with one click and search for their own processes as well as for those that other users have shared. These processes can then be downloaded and applied to your own data very easily.
“The community extension for myExperiment.org is a great addition to the new data and process repositories in RapidMiner 5 and builds strongly on the very successful basic idea behind Wikipedia”, stresses Dr. Simon Fischer, Head of Research and Development at Rapid-I. “Based on these workflows, Rapid-I will also develop further tools in the future to support users in the design of new processes.”
1 PMML is an XML-based standard that has been under continuous development since 1997. It is developed by the Data Mining Group (DMG), an independent group of vendors that defines data mining standards. Rapid-I is a member of the Data Mining Group, as are IBM, MicroStrategy, SPSS and SAS.