Data loading from XML file
johnweeks
Since my Chevy Volt is now almost a year old, I'd like to analyze my electric usage patterns. Portland General Electric (PortlandGeneral.com), if you look hard enough, has a link where you can download hourly usage covering a considerable time period. I guess to make sure that such a useful thing wasn't overly helpful :), they pack it into an XML file of their own design. A support person from PGE suggested that it was formatted for some specific third party. Certainly not for me, as I have managed to avoid learning much about XML.
So I was wondering if perhaps the worthies who inhabit IgorExchange might have some advice on how to proceed. To that end, I have attached a small sample of the data file.
Thanks for any insight.
One of the nodes is:
<IntervalReading><br />
<timePeriod><br />
<duration>3600</duration><br />
<start>1509537600</start><br />
</timePeriod><br />
<value>742</value><br />
</IntervalReading><br />
It looks like you used 742 of something in an hour, starting at an epoch time of 1509537600.
December 26, 2017 at 04:59 pm - Permalink
I can figure out the epoch from the dates covered by the file....
... and that large number appears to be a Posix epoch for Wednesday, November 1, 2017 12:00:00 PM GMT, which makes sense since I took a week of data starting with Nov. 1, 2017.
I think I'm on the way. It would be nice if PGE made this available in a way that didn't involve learning a new technology. Or maybe I could set myself up as a consultant who can decipher electric usage :)
John Weeks
WaveMetrics, Inc.
support@wavemetrics.com
December 26, 2017 at 10:10 pm - Permalink
•print(fid)
74
// this gets duration/start/value, but you also get blank space from the nodes.
// that's why we're going to use sockitwavetostring, sockitstringtowave
// if you use "//ns:value" you won't need to jump through those hoops.
•xmlwavefmxpath(fid, "//ns:IntervalReading", "ns=http://naesb.org/espi", "\r\n")
•string nums;
•sockitwavetostring/Txt=" " M_xmlcontent, nums
// all numbers appear to be integers
•sockitstringtowave/Tok=" " 2^7, nums
edit W_stringtowave
xmlclosefile(fid, 0)
December 27, 2017 at 12:38 am - Permalink
December 27, 2017 at 12:39 am - Permalink
December 27, 2017 at 07:52 am - Permalink
December 27, 2017 at 08:06 am - Permalink
But now I'm actually getting involved with this as an XPath learning tool. But I'm already stuck trying to understand. Based on the list you get from XMLelemlist(), I did this:
xmlwavefmxpath(73, "/*", "", "\n")
That gets lots of output, with lots of stuff I don't care about. But it seemed like this should get the same output, based on the material at the www.w3schools.com site:
xmlwavefmxpath(73, "/feed", "", "\n")
But that gets nothing. I looked at your example, and tried this:
xmlwavefmxpath(73, "/ns:feed", "", "\n")
That gets me an error: XMLutils: Failure with XPath expression
Edit: These questions are kind of stream-of-consciousness. Might be good to start at 4...
So:
1) What is "ns"?
2) That tutorial keeps talking about "the current node". Does that mean the node that you presently are at while parsing the Xpath?
3) It seems like I should be able to use literal node names, like "field" or "value". Why doesn't that work as in my attempt above?
... some time passes ...
4) I see that your example appears to define "ns" by setting it in the second string input. So is "//ns:IntervalReading" the same as "http://naesb.org/espi:IntervalReading"?
5) I take it that the namespaces are defined by strings like xmlns="http://naesb.org/espi" which is an attribute of UsagePoint node. Do all children of UsagePoint have to be qualified by the namespace?
6) This: xmlwavefmxpath(73, "//ns:UsagePoint/ServiceCategory/kind", "ns=http://naesb.org/espi", "\r\n") gets me nothing, but
This: xmlwavefmxpath(73, "//ns:kind", "ns=http://naesb.org/espi", "\r\n")
gets me a single row "0" which is, in fact, the value of UsagePoint/ServiceCategory/kind
That's probably enough questions for now. Thanks for your help!
John Weeks
WaveMetrics, Inc.
support@wavemetrics.com
December 28, 2017 at 05:01 pm - Permalink
When you get this error this means there is a problem with the XPath expression with respect to the file. E.g. you didn't use the right namespace.
Namespace - https://en.wikipedia.org/wiki/XML_namespace
a mixture of not using the qualified namespace and a problem with the XPath expression will be the cause.
you need to define the namespace when you want to examine specific node names. For example you could do "//weeksies:IntervalReading" with "weeksies=http://naesb.org/espi"
I think so.
I think you'd need to do:
xmlwavefmxpath(73, "//weeksies:UsagePoint/weeksies:ServiceCategory/weeksies:kind", "weeksies=http://naesb.org/espi", "\r\n")
That XPath would get all the "kind" nodes that are direct daughters of "ServiceCategory" which are direct daughters of all the "UsagePoint" nodes. It would not match "kind" nodes that didn't follow that direct ancestry.
XPath is very powerful, but there is a learning curve. I normally have to resort to reading XPath tutorials and a bit of trial and error to get things to work as I want.
A.
December 28, 2017 at 06:29 pm - Permalink
http://www.igorexchange.com/project/udStFiLrXML
I might therefore boldly or naively say, the use of XSL could be a more useful method, if for no other reason that you can use the XSL in other ways (e.g. at the shell level) and obtain a host of other output formats (e.g. CSV, RTF, HTML ...). I _think_ that XSL is also a bit more transparent in syntax settings to parse the tree at various levels.
But then, to mirror Andy's experience, even on the best of days, I generally had to rely on trial and error as well as tutorials to determine how to load deeply-nested data.
And then I woke up one morning and found everything is now about JSON :-)
--
J. J. Weimer
Chemistry / Chemical & Materials Engineering, UAH
December 29, 2017 at 05:50 am - Permalink
I'll try your suggestions, Andy. Trial and error works for me. You use a lot of my trial and error :)
John Weeks
WaveMetrics, Inc.
support@wavemetrics.com
December 29, 2017 at 07:29 am - Permalink
Two calls to xmlwavefmxpath() extracted each hourly reading, and the start time of that reading. Here are the calls I used:
xmlwavefmxpath(72, "//ns:IntervalReading//ns:value", "ns=http://naesb.org/espi", "\r\n")
That results in M_xmlcontent waves with *only* the desired info, which makes it easy to extract to numeric waves:
times = str2num(M_xmlcontent[0][p])
times += 2082844800 // offset from 1/1/1904 to 1/1/1970, converts from Unix epoch to Apple epoch
make/D/N=(DimSize(M_xmlcontent, 1))/O watts
watts = str2num(M_xmlcontent[0][p])
Naturally, since the time periods are all the same, I set the X scaling of my watts wave appropriately. Sure is nice that the Change Wave Scaling dialog can work in date/time!
Graph is attached. You can see where I plug in my Volt, usually about 6:00 PM when I get home from WaveMetrics. But Wednesday is squash league night and I don't plug in until around 10 or 11 PM. If I had a 240V level 2 charger, those peaks would be higher and shorter. But for my lifestyle, every night charging with the portable charger is generally enough and its hard to justify a $400+ charger and installation of a 240V circuit for it.
John Weeks
WaveMetrics, Inc.
support@wavemetrics.com
December 29, 2017 at 10:51 am - Permalink
The ultimate goal is to apply the published rates for time-of-day pricing and compare with my current constant-price plan.
John Weeks
WaveMetrics, Inc.
support@wavemetrics.com
December 29, 2017 at 10:53 am - Permalink
The XSL option would likely still be a better option even considering the already-invested costs to learn xpath and XMLutils when you want to work the XML also using something besides Igor Pro and/or extract the XML to some other format besides Igor Pro data.
--
J. J. Weimer
Chemistry / Chemical & Materials Engineering, UAH
December 29, 2017 at 05:19 pm - Permalink
John Weeks
WaveMetrics, Inc.
support@wavemetrics.com
December 30, 2017 at 10:46 pm - Permalink
[1]: https://en.wikipedia.org/wiki/Simple_API_for_XML
January 4, 2018 at 09:44 am - Permalink
John Weeks
WaveMetrics, Inc.
support@wavemetrics.com
January 4, 2018 at 11:42 am - Permalink
John Weeks
WaveMetrics, Inc.
support@wavemetrics.com
January 4, 2018 at 11:44 am - Permalink
XMLutils is designed to write XML files as well, but it will slow down on enormous files.
January 4, 2018 at 11:51 pm - Permalink
It seems not so difficult to read the usefull infomation using freadline and regular expressions...
January 5, 2018 at 07:21 am - Permalink