rOpenSci | Blog


Wednesday, February 14, 2018

Earlier this month we released a new version of the tesseract package to CRAN. This package provides R bindings to Google’s open source optical character recognition (OCR) engine Tesseract.

The two major new features are support for hOCR output and support for the upcoming Tesseract 4.

hOCR output

Support for hOCR output was requested by one of our users on GitHub. The ocr() function gains a parameter, HOCR, which returns results in hOCR format:

...

By Jeroen Ooms

Wednesday, February 14, 2018

Introducing the 2018 rOpenSci Research Fellows!

rOpenSci’s mission is to enable and support a thriving community of researchers who embrace open and reproducible research practices as part of their work. Since our inception, one of the mechanisms through which we have supported the community is by developing high-quality open source tools that lower barriers to working with scientific data. Equally important to our mission is building capacity and promoting researchers who are engaged in such practices within their disciplinary communities. This fellowship program is a unique opportunity for us to enable such individuals to have a bigger voice in their communities....

By Karthik Ram

Friday, February 9, 2018

.rprofile: Julia Stewart Lowndes

Thursday, February 8, 2018

Apply to attend rOpenSci unconf 2018!

For the fifth year running, we are excited to announce the rOpenSci unconference, our annual event loosely modeled on Foo Camp. rOpenSci unconferences have a rich history. You can get a feel for them by reading collected stories about people and projects from unconf17.

We’re organizing unconf18 to bring together scientists, developers, and open data enthusiasts from academia, industry, government, and non-profits for a couple of days to hack on various projects and generally enrich our community. The agenda is mostly decided during the unconference itself. Past projects have related to open data, data visualization, data publication, and open science using R. This event is unlike many other unconferences in that it is primarily invite-only, with a few spots set aside for self-nominations from the community at large. That’s you!

...

By Stefanie Butland

Tuesday, February 6, 2018

The prequel to the drake R package

The drake R package is a pipeline toolkit. It manages data science workflows, saves time, and adds more confidence to reproducibility. I hope it will impact the landscapes of reproducible research and high-performance computing, but I originally created it for different reasons. This post is the prequel to drake’s inception. There was struggle, and drake was the answer.

Dissertation frustration

My dissertation project was intense. The final computational challenge was to analyze multiple genomics datasets using an emerging method and its competitors. Even with GPU computing, which shrank days of runtime down to hours, the full battery of Markov chain Monte Carlo runs took several weeks from start to finish. I organized my workflow as an R package, and I worked in a loop:

...

By Will Landau