What is RTutor?
RTutor is an AI-based tool for data analysis. It provides a natural language interface that lets users interact with their data, generating R and Python code for statistical analyses and producing reports in HTML format. It is built on OpenAI’s text-davinci-003 language model, which translates natural language into R and Python code.
RTutor supports data files in CSV, TSV/tab-delimited text, and Excel formats. It detects data types automatically, can convert numeric columns to factors, and generates descriptive summaries and plots as well as code for correlation, GGpairs, and other analyses.
RTutor accepts requests in dozens of human languages, including Chinese, Ukrainian, Arabic, Hindi, Spanish, German, French, Luxembourgish, Vietnamese, Portuguese, Japanese, Italian, Persian, and more. Because it can detect and understand context, it can also answer generic questions that do not mention column names.
RTutor is released as a prototype for testing and improvement. It is a personal project of Steven Ge and is freely available to academic and non-profit organizations only. Commercial use beyond testing is not allowed.
Pros
- Generates R and Python code
- Supports CSV, TSV, and Excel formats
- Auto-detects data types
- Generates descriptive summaries
- Generates correlation and GGpairs analyses
- Supports numerous human languages
- Detects and understands context
- Auto-converts numeric columns to factors
- Freely available to academic and non-profit organizations
- Generates HTML reports
- Released as a prototype for testing and improvement
- Translates natural language to code
- Generic question answering
- Generates code chunks
- Data frame auto-loading as 'df'
- Voice input optional
- Code refinement for analytics
- Data visualization and exploration
- Generates and runs Python code
- Can automatically convert a column to row names
- Creates interactive plots
- Filtering options for data
- Provides code chunk records
- Sends error messages and results
Cons
- Only supports CSV, TSV, and Excel formats
- Commercial use not allowed
- Still in prototype/testing phase
- Requires data preparation in Excel
- Does not handle large datasets (>10 MB)
- Accuracy of generated code not guaranteed
- Generates code only in R and Python
- Does not include installed R packages
- Can fail due to server load
RTutor FAQ
What is RTutor?
RTutor is an artificial intelligence-based tool for data analysis. It gives users a natural language interface to their data: it translates natural language into R and Python code and runs a wide range of statistical analyses. RTutor can generate descriptive summaries and plots, and it works with data files in CSV, TSV/tab-delimited text, and Excel formats. It is also multilingual, accepting requests in dozens of human languages.
How does RTutor translate natural language into R and Python code?
RTutor uses OpenAI's text-davinci-003 language model to translate natural language into R and Python code. A request written in natural language is sent to the model, which generates the corresponding R or Python code. That code is then cleaned up and executed in a Shiny environment, and the results or any error messages are displayed.
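The "clean up" step mentioned above can be pictured with a small sketch. This is not RTutor's actual implementation: the `extract_code` helper and the idea that the model wraps code in Markdown fences surrounded by commentary are assumptions made for illustration only.

```python
import re

# Hypothetical helper: language models often wrap code in Markdown fences
# and add commentary around it. RTutor's real clean-up step may differ.
FENCE = "`" * 3
FENCE_RE = re.compile(FENCE + r"[A-Za-z]*[ \t]*\n(.*?)" + FENCE, re.DOTALL)


def extract_code(response: str) -> str:
    """Return only the code portion of a model response."""
    blocks = FENCE_RE.findall(response)
    if blocks:
        # Keep every fenced block and drop the prose between them.
        return "\n".join(block.strip() for block in blocks)
    # No fences found: assume the whole response is code.
    return response.strip()


response = (
    "Here is the code:\n" + FENCE + "r\nsummary(df)\n" + FENCE + "\nHope this helps."
)
print(extract_code(response))
```

Only the extracted code would then be passed on for execution, so stray commentary from the model does not cause syntax errors.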
How does RTutor support different languages for data analysis?
RTutor leverages the powerful capabilities of OpenAI’s text-davinci-003 language model, which supports various languages. This makes it possible for RTutor to process natural language instructions in many global languages, including but not limited to Chinese, Ukrainian, Arabic, Hindi, Spanish, German, French, Luxembourgish, Vietnamese, Portuguese, Japanese, Italian, and Persian.
What file formats does RTutor support for data analysis?
RTutor can analyse data files in CSV, TSV/tab-delimited text files, and Excel formats. Once uploaded, these data files are automatically loaded into RTutor as a data frame referred to as 'df'.
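As a rough sketch of what loading an uploaded file into `df` could involve, the example below picks a delimiter from the file extension. The `load_table` function and the extension-to-delimiter mapping are assumptions for illustration, not RTutor's code. It uses only the Python standard library, so Excel files (which need a third-party reader) are left out, and rows are plain dictionaries rather than a true data frame.

```python
import csv
import io
from pathlib import Path

# Assumed mapping from file extension to delimiter. Excel files need a
# third-party reader and are deliberately not handled in this sketch.
DELIMITERS = {".csv": ",", ".tsv": "\t", ".txt": "\t"}


def load_table(filename: str, text: str) -> list:
    """Parse delimited text into a list of row dictionaries."""
    suffix = Path(filename).suffix.lower()
    if suffix not in DELIMITERS:
        raise ValueError(f"unsupported file type: {suffix!r}")
    reader = csv.DictReader(io.StringIO(text), delimiter=DELIMITERS[suffix])
    return list(reader)


# The uploaded data is bound to the name 'df', mirroring the convention above.
df = load_table("cars.tsv", "model\tcyl\nMazda RX4\t6\nFiat 128\t4\n")
print(df[0]["model"], len(df))
```

Every value is read as text at this stage; deciding which columns are numeric or categorical is a separate step.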
How does RTutor generate reports in HTML format?
RTutor logs multiple requests to produce an R Markdown file that includes the executable R code. This file can be knitted into an HTML report, enabling record keeping and reproducibility of the analysis and results.
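The logging idea can be sketched as follows: each request and the R code generated for it become one section and one code chunk in an R Markdown document. The `build_rmd` function, the section headings, and the sample requests are hypothetical; the actual layout of RTutor's report is not described here.

```python
def build_rmd(title: str, steps: list) -> str:
    """Assemble R Markdown text from (request, r_code) pairs."""
    fence = "`" * 3
    # YAML header asking for HTML output when the file is knitted.
    lines = ["---", f'title: "{title}"', "output: html_document", "---", ""]
    for number, (request, r_code) in enumerate(steps, start=1):
        lines.append(f"## Request {number}: {request}")
        lines.append("")
        lines.append(fence + "{r}")  # opens an executable R chunk
        lines.append(r_code.strip())
        lines.append(fence)
        lines.append("")
    return "\n".join(lines)


# Hypothetical session log of (request, generated R code) pairs.
log = [
    ("Summarize the data", "summary(df)"),
    ("Plot mpg against wt", "plot(df$wt, df$mpg)"),
]
rmd = build_rmd("RTutor session", log)
print(rmd)
```

Saved as a `.Rmd` file, text like this can be knitted to HTML with `rmarkdown::render()`. A complete report would also need a chunk that loads the data into `df` before the logged chunks run.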
Can I use RTutor to generate Python code?
Yes. Besides R code, RTutor can generate Python code from instructions given in natural language. Python code generation and execution was added in a recent update.
What statistical analyses can RTutor perform?
RTutor can perform various statistical analyses including but not limited to descriptive summaries, plots, correlation analysis, and GGpairs analysis. It can also generate code for these analyses in R and Python for further execution. The type of analysis RTutor performs largely depends on the natural language command given by the user.
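To make the correlation case concrete, here is the kind of Python code a request about the relationship between two columns might lead to. It is a hand-written sketch, not output captured from RTutor; the `pearson` function name and the small sample values are illustrative assumptions.

```python
import math


def pearson(x: list, y: list) -> float:
    """Pearson correlation coefficient of two equally long numeric lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of cross-products and squared deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant column")
    return cov / math.sqrt(var_x * var_y)


# Small illustrative sample: car weight and fuel economy.
wt = [2.62, 2.875, 2.32, 3.215, 3.44]
mpg = [21.0, 21.0, 22.8, 21.4, 18.7]
print(round(pearson(wt, mpg), 3))
```

A negative result indicates that heavier cars in the sample tend to have lower fuel economy.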
How can RTutor help with data types detection and conversion?
RTutor auto-detects data types so that columns are appropriately cast for analysis. Users can check whether each column is correctly treated as numeric or as a category (factor or character). If a column has the wrong type, users can instruct RTutor to convert it with a request such as 'Convert cyl as numeric' or 'Convert year as factor'.
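One simple way such detection could work is sketched below: a column whose values all parse as numbers is numeric unless it has very few distinct values, in which case it is treated as a factor. The `guess_type` function and the `max_levels` threshold are assumptions chosen for illustration and do not describe RTutor's actual rule.

```python
def is_number(value: str) -> bool:
    """True if the text can be parsed as a floating-point number."""
    try:
        float(value)
        return True
    except ValueError:
        return False


def guess_type(values: list, max_levels: int = 5) -> str:
    """Classify a column of strings as 'numeric', 'factor', or 'character'."""
    present = [v for v in values if v != ""]  # ignore missing cells
    if not present:
        return "character"
    distinct = len(set(present))
    if all(is_number(v) for v in present):
        # Assumed rule: numbers with very few distinct values act as categories.
        return "factor" if distinct <= max_levels else "numeric"
    # Assumed rule: repeated text labels are categories, unique text is not.
    return "factor" if distinct < len(present) else "character"


columns = {
    "mpg": ["21.0", "22.8", "21.4", "18.7", "18.1", "14.3"],
    "cyl": ["6", "4", "6", "8", "6", "8"],
    "model": ["Mazda RX4", "Datsun 710", "Hornet 4 Drive",
              "Hornet Sportabout", "Valiant", "Duster 360"],
}
for name, values in columns.items():
    print(name, guess_type(values))
```

A heuristic like this will sometimes guess wrong, for example by treating a short numeric column as a factor, which is why explicit requests such as 'Convert cyl as numeric' are useful.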