Written by Zack Downey, Data Scientist, Ursa Space Systems
Ithaca, NY – January 9, 2018 – In the world of geospatial data, both data collection and data clean-up (structuring) matter a great deal. Oftentimes, though, they are overlooked or simply assumed to have been done well. That can be a fatal flaw: a user who relies on the information without checking may be working with data that lacks the purity and cleanliness a high-quality product requires.
When it comes to deriving global energy data from satellite imagery, there are a number of these crucial elements. For one, there are a variety of satellite-based sensors to choose from, such as optical and Synthetic Aperture Radar (SAR), which can produce radically different results (see Fig 1).
Not only is the type of sensor important, but the collection methods and measurement approaches can also be essential to building a valuable data product. For instance, the following can all impact quality:
- The frequency of collection
- The consistency of the areas collected
- The use of direct measurement versus estimation and interpolation
Let’s home in on that last point: the difference between directly measuring the data and interpolating it. When the information is directly measured, with a strict QA process in place, there is no interpolation between data points. No estimation is being done, and less handling and structuring of the data is needed. Less handling means less opportunity for human error, which should lead to a much more reliable data set.
The impact of collection frequency and area consistency is probably more obvious – the more often the data is collected and the more that data is collected from the same areas each time, the higher the quality and consistency.
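As a minimal sketch of what checking collection frequency might look like, the snippet below scans a list of collection dates for gaps longer than a week. The function name `find_gaps`, the seven-day threshold, and the dates are all assumptions made for illustration; they do not describe any particular provider's pipeline.

```python
from datetime import date, timedelta

def find_gaps(collection_dates, expected=timedelta(days=7)):
    """Return (previous, next) date pairs that are more than `expected` apart."""
    ordered = sorted(collection_dates)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > expected]

# Hypothetical collection dates: one weekly collect is skipped.
dates = [
    date(2017, 11, 16),
    date(2017, 11, 23),
    date(2017, 12, 7),
    date(2017, 12, 14),
]

gaps = find_gaps(dates)
for previous, following in gaps:
    print(f"No collect between {previous} and {following}")
```

Any gap reported here is a week for which a value would have to be estimated rather than measured.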
Real Life Example: Global Oil Storage Data
Let’s look at the weekly changes in Dalian, China, in the past several months:
At the end of November 2017, a drastic draw in Dalian inventories led to a lower end-of-month total China inventory level. Capturing this draw and the subsequent build was essential to an accurate end-of-month measurement for China crude balances. If the data point on 11/30/17 had not been acquired and directly measured, its value would instead have been interpolated from the data points on 11/23/17 and 12/7/17, and the draw would have been missed. That artificial data point would have given an incorrect month-to-month inventory change and could have damaged the accuracy of a global balance calculation.
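The effect can be sketched with a few lines of Python. The inventory figures below are invented purely for illustration and are not Ursa measurements; only the three dates come from the example above. The helper `linear_interpolate` is a hypothetical name, and simple linear interpolation is assumed as the gap-filling method.

```python
from datetime import date

def linear_interpolate(d0, v0, d1, v1, d):
    """Linearly interpolate a value at date d between (d0, v0) and (d1, v1)."""
    span_days = (d1 - d0).days
    fraction = (d - d0).days / span_days
    return v0 + fraction * (v1 - v0)

# Hypothetical inventory levels in million barrels (illustrative only).
nov_23 = (date(2017, 11, 23), 50.0)
nov_30 = (date(2017, 11, 30), 44.0)  # sharp draw, directly measured
dec_07 = (date(2017, 12, 7), 49.0)   # subsequent build

# What a provider would report for 11/30 if that week had not been measured.
estimated = linear_interpolate(*nov_23, *dec_07, nov_30[0])
error = estimated - nov_30[1]

print(f"interpolated: {estimated:.1f}, measured: {nov_30[1]:.1f}, "
      f"overstated by: {error:.1f}")
```

With these made-up numbers, the interpolated end-of-month value is 49.5 against a measured 44.0, so the draw disappears and the month-end level is overstated by 5.5.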
Why Ursa’s Global Oil Storage Data
In Ursa’s Global Oil Storage product, this principle of weekly and direct measurement is essential. By measuring every oil tank, every week, our users know that they get a regular, high-quality update that they can rely on for a variety of use cases (such as the China oil example above).
In summary, if you’re looking for global energy data, particularly in a time series, reliability and direct measurements matter. Using technologies other than SAR, not collecting data frequently enough, and not collecting consistently from the same sources may cause missing measurements and force a data provider to fill in the gaps with estimation and statistical interpolation, potentially leaving out major weekly changes. This artificial generation of data, especially with regard to physical movements, can cause major issues in end usage and model generation, leading to less reliable data and ultimately less confidence in using the data to make the best decisions.