Know how to run processes on the server in a variety of ways. In all cases, the process should first be saved in a server repository.
Be able to use:
Remote Execution for on-demand development and non-repetitive tasks.
Process Scheduler for regular or near-continuous tasks.
Web Services or Web Apps for deployment.
Triggers from file changes or e-mails.
Understand Model Deployment. It involves additional consideration for quality control, monitoring, and access control.
Understand Release Management. Processes and models meant for deployment should go through release management with development, test, and production phases. Environments should be planned for each phase, and transitions including rollbacks should be planned.
Understand Versioning of processes on the server.
Be able to use Logging operators. They are used heavily and in many ways.
Be able to use the Handle Exception and Throw Exception operators. They can help to make sure errors are handled gracefully and relevant information is captured.
Be able to store results. After Scoring, the results often need to be stored in a file or database.
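As an illustration of storing scored results in a database, here is a minimal Python sketch outside of RapidMiner. The table name `scored_results`, its columns, and the function `store_scores` are hypothetical; inside a RapidMiner process the same step would be done with the relevant write operators.

```python
import sqlite3


def store_scores(conn, rows):
    """Store (record_id, prediction, confidence) rows and return the row count.

    The table layout is an assumption made for this sketch.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS scored_results ("
        "record_id TEXT PRIMARY KEY, "
        "prediction TEXT NOT NULL, "
        "confidence REAL NOT NULL)"
    )
    # The connection context manager commits on success and rolls back on error.
    with conn:
        conn.executemany(
            "INSERT OR REPLACE INTO scored_results VALUES (?, ?, ?)", rows
        )
    return conn.execute("SELECT COUNT(*) FROM scored_results").fetchone()[0]


# Example with an in-memory database; a real deployment would use a file or server.
connection = sqlite3.connect(":memory:")
store_scores(connection, [("a", "yes", 0.91), ("b", "no", 0.73)])
```

Using the record identifier as the primary key means that re-scoring a record overwrites its earlier result instead of creating a duplicate row.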
Be able to use Explain Predictions. It can help decision makers interpret the model.
Be able to create and use Web Services. They provide a flexible way to integrate RapidMiner models and processes into other business systems. Authentication to the RapidMiner Server is required by default, but anonymous Web Services may be enabled to bypass this requirement.
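From the calling side, integrating with another system might look like the following Python sketch. It only builds the request and does not send it. The base URL, the service name `score_customer`, the parameter name, and the use of HTTP Basic authentication are all assumptions for illustration; check your own server for the actual URL and authentication scheme.

```python
import base64
from urllib.parse import urlencode
from urllib.request import Request


def build_service_request(base_url, service_name, params, username=None, password=None):
    """Build a GET request for a web service (hypothetical URL layout)."""
    url = f"{base_url.rstrip('/')}/{service_name}"
    if params:
        url += "?" + urlencode(params)
    request = Request(url)
    # Credentials are only attached when given; an anonymous service needs none.
    if username is not None:
        raw = f"{username}:{password}".encode("utf-8")
        token = base64.b64encode(raw).decode("ascii")
        request.add_header("Authorization", f"Basic {token}")
    return request


# Hypothetical server, service, and credentials.
example = build_service_request(
    "https://rm.example.com/services/",
    "score_customer",
    {"customer_id": "42"},
    username="analyst",
    password="secret",
)
```

The request could then be sent with `urllib.request.urlopen(example)` once the real URL and credentials are known.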
Understand Model Management. It continues after a model has been deployed to production.
Know how to Measure Performance over Time. This provides a way to observe when a model’s validity may expire due to changes in the business or other external changes.
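One simple way to measure performance over time is to compute the same metric separately for each period. The Python sketch below assumes scored records are available as hypothetical `(period, actual, predicted)` tuples and uses accuracy as the metric; any other metric could be substituted.

```python
from collections import defaultdict


def accuracy_by_period(records):
    """Return {period: accuracy} for an iterable of (period, actual, predicted)."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for period, actual, predicted in records:
        total[period] += 1
        correct[period] += int(actual == predicted)
    # Sort periods so the result reads chronologically for ISO-style labels.
    return {period: correct[period] / total[period] for period in sorted(total)}


# Hypothetical scored records.
history = [
    ("2024-01", "yes", "yes"),
    ("2024-01", "no", "no"),
    ("2024-02", "yes", "no"),
    ("2024-02", "no", "no"),
]
trend = accuracy_by_period(history)
```

A steady decline across periods is the kind of signal that suggests the model may need retraining.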
Be able to calculate ROI. It is an important part of evaluating projects. If estimates are available for the value or cost of each kind of correct or incorrect decision, and for the number of decisions, the calculation is straightforward. It is often useful to use the Performance (Cost) operator to calculate model value on observed data.
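The ROI calculation described above can be sketched in Python as follows. This is not the Performance (Cost) operator itself; the function names, the per-decision values, the confusion counts, and the project cost are all hypothetical figures chosen for illustration.

```python
def value_per_decision(confusion_counts, decision_values):
    """Average value of one decision.

    Both arguments are dicts keyed by (actual, predicted) pairs.
    """
    total = sum(confusion_counts.values())
    if total == 0:
        raise ValueError("no observed decisions")
    gained = sum(
        count * decision_values[outcome]
        for outcome, count in confusion_counts.items()
    )
    return gained / total


def roi(average_value, decisions, project_cost):
    """Return ROI as a fraction: (gain - cost) / cost."""
    if project_cost <= 0:
        raise ValueError("project_cost must be positive")
    return (average_value * decisions - project_cost) / project_cost


# Hypothetical observed outcomes and values per (actual, predicted) pair.
counts = {("yes", "yes"): 80, ("yes", "no"): 20, ("no", "yes"): 50, ("no", "no"): 850}
values = {("yes", "yes"): 50.0, ("yes", "no"): -20.0, ("no", "yes"): -5.0, ("no", "no"): 0.0}

average = value_per_decision(counts, values)
project_roi = roi(average, decisions=100_000, project_cost=100_000.0)
```

With these made-up figures the average value works out to about 3.35 per decision, so 100,000 decisions against a cost of 100,000 give an ROI of roughly 2.35, or 235%.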
Know how to monitor against thresholds and throw alerts. Each business problem has different thresholds, constraints, and boundaries. Not only does model performance need to be monitored, but input and output data need to be monitored for both volume and values. When data crosses a threshold, an alert may be sent, often starting with Throw Exception.
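The monitoring pattern can be sketched in Python as a check of observed values against configured bounds, raising an exception when any bound is crossed. This is an analogy for the pattern and not RapidMiner code; `Threshold`, `ThresholdAlert`, and the metric names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    lower: float
    upper: float


class ThresholdAlert(Exception):
    """Raised when one or more monitored values leave their allowed range."""


def find_violations(observed, thresholds):
    """Return a list of messages, one per missing or out-of-range value."""
    violations = []
    for name, limit in thresholds.items():
        if name not in observed:
            violations.append(f"{name}: no value observed")
            continue
        value = observed[name]
        if not (limit.lower <= value <= limit.upper):
            violations.append(
                f"{name}={value} outside [{limit.lower}, {limit.upper}]"
            )
    return violations


def check_or_raise(observed, thresholds):
    """Raise ThresholdAlert if any monitored value violates its threshold."""
    violations = find_violations(observed, thresholds)
    if violations:
        raise ThresholdAlert("; ".join(violations))


# Hypothetical limits covering data volume and model performance.
limits = {
    "row_count": Threshold(1_000, 50_000),
    "accuracy": Threshold(0.80, 1.00),
}
```

Treating a missing value as a violation is a deliberate choice in this sketch: an input feed that silently stops arriving is often as important to catch as one that goes out of range.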
Understand why and how to watch for Sample Deviations and Bias. Training and testing data may differ from each other, or from the underlying source of data. This may result from random sampling variation, or it may reflect bias in the data gathering process. It’s particularly important to watch for this sort of bias in projects that could have social implications.
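A first check for sample deviation is to compare how often each category appears in two samples. The Python sketch below uses hypothetical function names and made-up data; it reports only the size of each gap and does not decide whether a gap is due to chance or to bias, which would need a statistical test and domain judgment.

```python
from collections import Counter


def proportions(values):
    """Return {category: share of the sample} for a non-empty list."""
    if not values:
        raise ValueError("sample is empty")
    counts = Counter(values)
    return {category: count / len(values) for category, count in counts.items()}


def proportion_gaps(sample_a, sample_b):
    """Absolute difference in category share between two samples."""
    share_a = proportions(sample_a)
    share_b = proportions(sample_b)
    categories = sorted(set(share_a) | set(share_b))
    return {
        category: abs(share_a.get(category, 0.0) - share_b.get(category, 0.0))
        for category in categories
    }


# Hypothetical attribute values in a training set and a test set.
training = ["group_a"] * 50 + ["group_b"] * 50
testing = ["group_a"] * 80 + ["group_b"] * 20
gaps = proportion_gaps(training, testing)
```

Here each group's share differs by about 0.3 between the two samples, a gap large enough to warrant a closer look at how the data was gathered.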
Understand what Web Apps are and what they can do. They are web interfaces where users can see, explore, and change the data. This interactivity can be created without coding or relying on other software.
Be able to create Web Apps. From the server web interface, you can access the App Designer, which is used to create Web Apps. Web Apps and their related processes should be developed together. This makes it easier to keep the variables and app objects in the Web App aligned with the macros and published RapidMiner objects in the processes.
Be able to manage Web Apps. This is also done from the App Designer, which allows you to delete Web Apps, change their location, or deploy them.