“Big Data” can help address clinical problems. Analyzing the massive amount of clinical information available in digital format will make it possible to create a rapid learning health care system in which predictive tools are developed, validated and updated to assist clinicians in personalizing treatment. Yet some hurdles have to be overcome. Besides technological, privacy and security issues, the most important bottleneck is the quality of the available clinical data. To derive insights from data, it is critical that they are accurate and relatively complete. Thus, relevant variables should be collected and their definitions should be clear. In addition, machine learning algorithms require structured data, while the richest current source of clinical data, the clinicians’ notes, is unstructured. Research protocols that specify which variables to collect and how to define them can address these problems. However, writing research protocols is time-consuming, and many clinicians lack the time to do so, although they recognize the importance of collecting high-quality data. We therefore created this open source research protocol repository. We anticipate that this initiative will stimulate centers to participate in outcomes research and will improve the standardization and quality of data.