Creating a box plot
rhoover
Hello! I am trying to create a box and whisker plot but am struggling to get it to display the data how I would like. I would like to create a box plot where text labels are from text column "Unit4" and data points are from "modeled_kT_DPB4". Not every data point from "modeled_kT_DPB4" has a "Unit4" label so I want to skip those points. Is that possible? Attaching a screenshot of what my data look like.
Hi,
I would use Extract to create subsets of the waves where you have a label. One caveat: strlen returns NaN for a null string, which we can test for with numtype (it returns 2 for NaN).
Extract unit4, unit4_sub, numtype(strlen(Unit4))!=2
Extract modeled_kT_DPB4, modeled_kT_DPB4_sub, numtype(strlen(Unit4))!=2
Then create the box-and-whisker plot from the "_sub" waves.
Andy
March 17, 2024 at 04:19 pm - Permalink
Hi Andy,
When I do this and create a box plot I am still only getting 1 box that combines all of my modeled_kT data. How do I split it up to individual boxes based on Unit?
March 17, 2024 at 05:11 pm - Permalink
You'll want to make a multi-column wave for your data, so that each unit has its own column. That way each Unit column will become its own category in the plot.
March 18, 2024 at 08:41 am - Permalink
Hi,
Worked the problem a bit and wrote a little function.
It takes three inputs: the text wave with the labels, the wave with the data, and a flag for what to do with empty ("") labels: keep them as "Other" or ignore them. It creates (or reuses) a data folder called "box" that holds the data waves and a corresponding label wave. In the box plot dialog, select the data waves as the waves to plot and the unique labels as the X wave.
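// LW is the text wave holding the group labels
// DW is the data wave to be split into one wave per label
// keep is a flag for unlabelled entries: 0 = drop them, 1 = keep them as "Other"
// NOTE: the Function line and parameter declarations are missing from the post;
// the name SplitByLabel is a placeholder.
Function SplitByLabel(LW, DW, keep)
	Wave/T LW
	Wave DW
	Variable keep
	Variable index, maxindex
	DFREF saved = GetDataFolderDFR()
	NewDataFolder/S/O root:box
	FindDuplicates/RT=uniqueLabels LW	// text wave of labels with duplicates removed
	maxindex = numpnts(uniqueLabels)	// number of unique labels
	for(index = 0; index < maxindex; index += 1)
		if(strlen(uniqueLabels[index]) != 0 || keep)
			// one data wave per label, named DW_<index>
			Extract/O DW, $("DW_" + num2str(index)), stringmatch(LW, uniqueLabels[index])
			if(strlen(uniqueLabels[index]) == 0)
				uniqueLabels[index] = "Other"
			endif
		endif
	endfor
	SetDataFolder saved
End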
Andy
March 18, 2024 at 08:48 am - Permalink
Thanks so much! I had to edit a little to save the data into a different folder but otherwise seems to work well. I really appreciate the help!
March 18, 2024 at 11:26 am - Permalink
Kind of as a follow up to this...I am wanting to also calculate the lognormal mean of my data for each new DW. They seem to have a lognormal distribution so I am using the attached equation.
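(The attached equation is not reproduced in this thread. Judging from the assignments further down in this post, it is presumably the maximum-likelihood estimate of the log-scale mean, where n is the number of points in a given DW wave and x_i are its values. This is a reconstruction from the code, not the original attachment.)

```latex
\hat{\mu} = \frac{1}{n} \sum_{i=1}^{n} \ln x_i
```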
I am first creating a table...
Make/O/N=(numpnts(uniquelabels)) MLEMean
Then trying to populate my table with these mean values by
MLEMean[0]=1/numpnts(DW_1)*sum(ln(DW_1))
MLEMean[1]=1/numpnts(DW_2)*sum(ln(DW_2))
MLEMean[2]=1/numpnts(DW_3)*sum(ln(DW_3))
Etc...
However, Igor doesn't like that I have a function and not a wave variable within the sum brackets. Any good way around this?
Thanks!
March 18, 2024 at 04:56 pm - Permalink
The error occurs because the ln function expects a single number, not a wave. If you put the expression in a wave assignment instead, each point in the wave gets passed to ln in turn. You could do something like this:
Function/WAVE naturalLog(dw)
	Wave dw
	Make/FREE lnWave
	lnWave = ln(dw) // operates on each point in the wave b/c of the assignment
	return lnWave
End
and then call:
MLEMean[0]=1/numpnts(DW_1)*sum(naturalLog(DW_1))
March 19, 2024 at 09:29 am - Permalink
Yeah, the sum() function requires a wave, not a number returned from a function.
The MatrixOP operation offers (in this case) a more natural way to do this. Here is an example:
MatrixOP/O term1 = (1/numPoints(DW_1))*sumcols(ln(DW_1))
The peculiarity here is that "term1" is a one-point wave, not a variable. But it's easy to handle that. And MatrixOP is all about functions being applied to waves and returning waves. As the name implies, it is more oriented toward matrices, and that's why you need "sumcols" and not just "sum". You are working with a one-column matrix here :)
March 19, 2024 at 09:39 am - Permalink
Ok, that makes sense. thank you!
March 20, 2024 at 01:50 pm - Permalink
And my colleague AG here has pointed out that you can replace that expression with this:
MatrixOP/O term1 = averageCols(ln(DW_1))
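To avoid typing one assignment per wave, the same averageCols idea can be put in a loop. This is only a rough sketch: it assumes the DW_0, DW_1, ... waves and the uniqueLabels text wave from the function earlier in the thread are in root:box (adjust the folder if yours were saved elsewhere), and FillMLEMean is a made-up name.

```igor
// Sketch only. Assumes DW_0, DW_1, ... and uniqueLabels live in root:box.
// FillMLEMean is a hypothetical helper name, not part of Igor.
Function FillMLEMean()
	DFREF saved = GetDataFolderDFR()
	SetDataFolder root:box
	Wave/T uniqueLabels
	Variable n = numpnts(uniqueLabels)
	Make/O/D/N=(n) MLEMean = NaN	// NaN marks labels that have no data wave
	Variable i
	for(i = 0; i < n; i += 1)
		Wave/Z dw = $("DW_" + num2str(i))
		if(WaveExists(dw))
			MatrixOP/FREE avg = averageCols(ln(dw))	// one-point wave
			MLEMean[i] = avg[0]
		endif
	endfor
	SetDataFolder saved
End
```

Because the DW_ waves are numbered by their index in uniqueLabels, MLEMean[i] should line up with uniqueLabels[i].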
March 20, 2024 at 05:26 pm - Permalink
In reply to The error is because the ln… by Ben Murphy-Baum
lnWave needs to be the same length as the input wave dw, methinks:
Function/WAVE naturalLog(dw)
	Wave dw
	Make/FREE/N=(DimSize(dw,0)) lnWave // assuming 1-D dw
	lnWave = ln(dw) // operates on each point in the wave b/c of the assignment
	return lnWave
End
March 28, 2024 at 11:23 am - Permalink