Hi fellow hackers. This post is about an R package I am well underway on and plan to submit to CRAN in the fall. I want to explain what it’s about and how I came to build it. At the end I’ll be asking for feedback, ideas and contributions from anyone who is interested. I can also provide tutorials for those looking to learn R, and am willing to learn from those who are more experienced and have advice. Here goes my summary…
I posted earlier about an AI project which never really got off the ground since I lacked any experience. However, the attempt to learn got me deep down a rabbit hole of coding for the first time. I installed an AI assistant (powered by ChatGPT and GPT-4). Using this assistant as an interactive textbook all summer, my coding skills exploded and I learned so much about how computers work in general. I learned R and Python really well and also dabbled in MedState Notation (the programming language for the MedPC system I use at work for behavioral neuroscience experiments on rats).
No, I don’t just copy the code given to me. I always use the concepts explained by my AI assistant to write my own, because the darn AI isn’t perfect. Thanks to all the people who pointed me towards GPT-3.5, btw.
Slowly I went from a minimally functional understanding of R to an ability to write my own operators and functions. This took months, but I am so grateful to those who suggested the use of the GPT tool. Amongst my early achievements are the following scripts:
1.) Ratcombinator. My first R script of note. You feed it the names of your male and female rats, along with performance data from some training days. It looks at all possible group combinations within certain stated parameters and gives you the best counterbalancing for the group assignment. If you are not familiar with the way experiments are done in my field, the short version is that we want two groups for our experiment which are as closely balanced in performance as possible. Ratcombinator does this mathematically and creates the best-balanced groups it can find. My plan is to turn it into a universally usable function soon and add it to my library.
2.) Stability Checker. One of the chores that needs to be done before any test is to show that rat performance is not significantly different depending on what day the test occurs. We want our rats to show stable performance for 3 training days in succession prior to a test with the actual test conditions. This requires running a two-way ANOVA using data from the last 3 training days before each test day, which in turn means subsetting the data to include only those specific days and the rats in the group that is testing. I created functions that find the most recent training days automatically, given a list of training dates and test dates, add them to a list, and then subset the data along the vectors in that list. Then I have a function that runs ANOVAs over the list of these subsets. This automates all the work and even saves the results in a labelled table which can be exported to Excel for my superiors. All these useful functions, plus my own version of the ‘%in%’ operator (which lives in base R, not dplyr), are in my library and were created by me without help from my AI. I have also constructed a series of list-generating functions that can be used to run various statistical analyses using loops. The key is to make this process simple and easy even for us computer-illiterate neuroscience majors.
3.) Drug Tracker. At some point I wanted to know exactly how much drug I was using on a test-by-test basis, so I could plan. I made a simple loop function that filters data from an Excel spreadsheet of rat weights and drug doses given by test date. As new tests are done, the loop picks up the new data and adds it to a table that calculates the exact mg amount of drug used. This amount is based on the dose given to each rat multiplied by its weight, plus a fixed volume times the drug concentration to account for the quantity inevitably lost in the disposable needle tip after each injection. This results in an extremely accurate drug consumption log that allows our lab to know exactly when to expect to run low and order more drugs. This feature even generates graphs to show remaining drug estimates visually (although the graphs are dependent on ggplot2).
4.) Fat Rat. This was my hopeful AI concept that would measure and predict rat weights, but alas it never saw the light of day.
5.) Automated upload system. Using a mix of Python and R, I created a series of scripts that are launched using Windows Task Scheduler and trigger on file creation events in my data folder, where my rat data automatically saves after the rats finish their test programs. From there the files are backed up and exported to Excel, then the data is read, cleaned and organized. Condition, sex, group and comment data is written in from a plethora of other data sheets I keep. All this stuff was done manually before, but I’ve saved an hour a day by programming a bot to do it flawlessly.
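For anyone wondering what the Ratcombinator idea from item 1 looks like in code, here is a rough base R sketch. It is a simplified illustration and not the actual script: the function name `balance_groups` and the scores are made up, it only balances on mean score, and it ignores sex and any other constraints. It also tries every split, so it gets slow quickly as the number of rats grows.

```r
# Hypothetical helper: exhaustively split rats into two equal-sized groups
# whose mean scores are as close as possible.
balance_groups <- function(rats, scores) {
  stopifnot(length(rats) == length(scores), length(rats) %% 2 == 0)
  n <- length(rats)
  # Every way of choosing half the rats for group A (one candidate per column)
  combos <- combn(n, n / 2)
  # Absolute difference between the two group means for each candidate split
  gap <- apply(combos, 2, function(idx) {
    abs(mean(scores[idx]) - mean(scores[-idx]))
  })
  best <- combos[, which.min(gap)]
  list(group_a = rats[best], group_b = rats[-best], mean_gap = min(gap))
}

# Made-up example data
rats   <- c("R1", "R2", "R3", "R4", "R5", "R6")
scores <- c(10, 12, 15, 11, 14, 13)
result <- balance_groups(rats, scores)
result$group_a
result$group_b
```

With these made-up scores no perfect split exists (the total is odd), so the best it can do is a gap of one third of a point between the group means.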
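The "find the most recent training days" step from item 2 can be sketched in base R roughly like this. The function name `last_training_days` and the dates are invented for the example, and I am assuming a training day only counts if it falls strictly before the test date.

```r
# Hypothetical helper: for each test date, return the n most recent
# training dates that fall strictly before it.
last_training_days <- function(training_dates, test_dates, n = 3) {
  training_dates <- sort(as.Date(training_dates))
  test_dates <- as.Date(test_dates)
  out <- lapply(seq_along(test_dates), function(i) {
    earlier <- training_dates[training_dates < test_dates[i]]
    tail(earlier, n)  # returns fewer than n dates if not enough exist
  })
  names(out) <- format(test_dates)
  out
}

# Made-up example dates
training <- c("2023-06-01", "2023-06-02", "2023-06-05",
              "2023-06-06", "2023-06-08")
tests    <- c("2023-06-07", "2023-06-09")
windows  <- last_training_days(training, tests)
windows[["2023-06-07"]]
```

Each element of `windows` is a vector of dates that can be used to subset the main data frame (for example with `%in%` on a date column), and the resulting subsets can then be looped over and passed to `aov()`.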
This is what I’ve been up to in my free time between my experiment work. I have decided to take this summer’s coding work and condense it into an R package called RatTools, which is easy for anyone to use. If you do rat work I would love to hear what you would find useful for yourself. Let me know if you are at all interested.
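To give a feel for the kind of small, easy-to-call function I have in mind, here is the drug tracker arithmetic from item 3 as a sketch. The function name, argument names, units and numbers are all assumptions made up for the example (dose in mg/kg, weight in grams, concentration in mg/mL, and a fixed lost volume per injection in mL).

```r
# Hypothetical helper: total drug (mg) used on one test day.
# dose_mg_per_kg : dose for each rat (mg/kg)
# weight_g       : weight of each rat (g)
# conc_mg_per_ml : concentration of the drug solution (mg/mL)
# dead_volume_ml : assumed volume lost in the needle tip per injection (mL)
drug_used_mg <- function(dose_mg_per_kg, weight_g,
                         conc_mg_per_ml, dead_volume_ml) {
  injected <- dose_mg_per_kg * weight_g / 1000  # mg actually injected per rat
  lost     <- dead_volume_ml * conc_mg_per_ml   # mg left behind per injection
  sum(injected + lost)
}

# Made-up numbers: three rats at 1 mg/kg, a 10 mg/mL solution,
# and 0.05 mL lost per injection
total <- drug_used_mg(dose_mg_per_kg = c(1, 1, 1),
                      weight_g       = c(300, 350, 400),
                      conc_mg_per_ml = 10,
                      dead_volume_ml = 0.05)
total  # 2.55
```

Here the rats receive 0.3 + 0.35 + 0.4 = 1.05 mg between them, and 0.5 mg is lost per injection, so the day uses 2.55 mg in total.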
I will release any of the aforementioned scripts if anyone wants to take a look. I love coding so far and am having a blast. Send me your ideas for functions I should include, or any code you might wish to donate to my library if you have already written some. The only limitation is that I want no dependencies outside of the base R functions and operators. If you need something that isn’t in base R, either write a function or operator to replace it or ask me to write one. I look forward to your feedback or contributions. Anyhow, thanks for reading. I would love to share my knowledge or increase it.
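As an example of what "write an operator to replace it" can look like, here is a tiny "not in" operator built only from base R. The name `%nin%` is just one I picked for illustration; it is not part of base R.

```r
# Hypothetical "not in" operator using only base R.
# Returns TRUE for each element of x that does NOT appear in table.
`%nin%` <- function(x, table) {
  match(x, table, nomatch = 0L) == 0L
}

c("R1", "R7") %nin% c("R1", "R2", "R3")
# [1] FALSE  TRUE
```

Any function whose name is wrapped in percent signs and takes two arguments can be used this way, so contributions don't need anything outside base R to define new operators.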