Field Analysis Demo

Field analysis is a calculation method for validating the expected locations of discrete plant components (e.g. combiners). It identifies cloud motion in the individual component time series and infers each component's position from the relative timing of the cloud signal at that component. A full description of the technique is given in: J. Ranalli and W. Hobbs, "PV Plant Equipment Labels and Layouts can be Validated by Analyzing Cloud Motion in Existing Plant Measurements," submitted to IEEE Journal of Photovoltaics.

This demo applies the method to sample data from a single plant.

[1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

from solarspatialtools import spatial, field

Read the Data

The sample data file is a compressed H5 file that contains all the information needed to perform the analysis.

The Plant Layout

The layout of the plant is placed in the variable pos_utm. It is read from the data file under the key utm, which holds a DataFrame of the combiner centroids in a UTM-like coordinate system. The index contains the combiner IDs, and the columns E and N give the mean position of each combiner footprint (i.e. the centroid). The combiners are named following the pattern CMB-<INVERTER#>-<COMBINER#>.

[2]:
datafile = "data/sample_plant_2.h5"
pos_utm = pd.read_hdf(datafile, mode="r", key="utm")
pos_utm
[2]:
E N
CMB-001-1 23.584970 1086.290244
CMB-001-2 49.679324 1086.290365
CMB-001-3 84.519736 1086.290633
CMB-001-4 125.324187 1086.291124
CMB-001-5 160.281453 1086.291697
... ... ...
CMB-048-4 695.084107 31.788344
CMB-048-5 729.924241 31.791192
CMB-048-6 764.894729 31.794191
CMB-048-7 799.739358 31.797319
CMB-048-8 831.869055 31.800323

382 rows × 2 columns
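Because the combiner IDs follow a fixed pattern, the inverter and combiner numbers can be recovered from the index when results need to be grouped by inverter. The following is a minimal sketch that uses a small hypothetical stand-in for pos_utm (three rows copied from the table above) rather than the data file itself.

```python
import pandas as pd

# Hypothetical stand-in for pos_utm, built from three rows of the table above
pos_example = pd.DataFrame(
    {"E": [23.584970, 49.679324, 695.084107],
     "N": [1086.290244, 1086.290365, 31.788344]},
    index=["CMB-001-1", "CMB-001-2", "CMB-048-4"],
)

# Split "CMB-<INVERTER#>-<COMBINER#>" into its three dash-separated parts
id_parts = [name.split("-") for name in pos_example.index]
pos_example["inverter"] = [int(p[1]) for p in id_parts]
pos_example["combiner"] = [int(p[2]) for p in id_parts]

# Count how many combiners belong to each inverter
combiners_per_inverter = pos_example.groupby("inverter").size()
print(pos_example)
print(combiners_per_inverter)
```

The same list comprehension should work on the full pos_utm index, provided every ID follows the pattern.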

The Time Series Data

The method relies on two discrete time periods, each with a fixed cloud motion vector (CMV). Two examples are stored in the data file with keys data_a and data_b. Each holds one hour of data for every combiner in the plant at a 10 s sampling resolution. The DataFrames are indexed by a time that is artificially offset to begin at 00:00:00 on an arbitrary day. The columns of the DataFrames have keys that match the index of pos_utm.

[3]:
ts_data_a = pd.read_hdf(datafile, mode="r", key="data_a")
ts_data_b = pd.read_hdf(datafile, mode="r", key="data_b")
ts_data_a
[3]:
CMB-001-1 CMB-001-2 CMB-001-3 CMB-001-4 CMB-001-5 CMB-001-6 CMB-001-7 CMB-002-1 CMB-002-2 CMB-002-3 ... CMB-047-7 CMB-047-8 CMB-048-1 CMB-048-2 CMB-048-3 CMB-048-4 CMB-048-5 CMB-048-6 CMB-048-7 CMB-048-8
2023-01-01 00:00:00 93.799402 93.985920 93.622077 94.318063 93.390276 93.459779 93.753268 95.253145 95.510679 95.718582 ... 92.871630 92.798612 91.074627 91.385004 91.554207 91.528461 91.404796 91.747153 91.461591 91.606280
2023-01-01 00:00:10 94.508405 94.819152 95.028923 95.105311 93.855248 94.410892 94.540710 95.633790 95.760610 95.699990 ... 92.753680 92.873880 91.292149 91.850400 91.644421 91.739311 91.842573 91.653509 91.475838 91.425386
2023-01-01 00:00:20 94.767218 95.245977 95.438032 95.826806 94.000549 94.737096 94.925867 95.603580 95.745574 95.265605 ... 92.933072 92.732030 91.251834 91.799432 91.926831 91.726960 91.657963 91.725119 91.453577 91.490994
2023-01-01 00:00:30 94.880949 95.406670 95.139340 95.771861 94.032838 94.730620 95.055108 95.470650 95.914626 96.075220 ... 92.942349 92.877739 91.443098 92.072665 91.937813 91.629913 91.534880 91.854972 91.431316 91.431947
2023-01-01 00:00:40 94.837813 95.538455 95.515880 95.826806 94.407401 94.698243 95.155677 95.337730 95.736634 95.525902 ... 92.886374 92.751329 91.496539 92.014166 91.867211 91.532872 91.664974 91.688921 91.409054 91.577226
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
2023-01-01 00:59:20 14.373001 13.225903 14.791491 13.385478 13.085343 13.466051 14.118774 13.721180 13.306969 8.306490 ... 14.580523 15.408295 14.735548 15.268503 15.645326 17.442542 17.913292 17.870185 16.114405 16.653684
2023-01-01 00:59:30 14.583977 13.338135 14.856895 13.349449 13.317424 13.456741 14.205222 13.811406 13.327289 8.690166 ... 14.611648 15.615766 14.689607 14.798921 14.540775 16.176576 18.011441 18.704340 17.892674 19.351115
2023-01-01 00:59:40 14.610643 13.486927 14.896615 13.516509 13.312177 13.673675 14.273695 13.934259 13.347608 8.382548 ... 14.247965 15.191175 14.563032 14.489767 14.585489 15.712535 15.359870 15.863494 18.463466 23.291351
2023-01-01 00:59:50 14.594957 13.737747 15.056860 13.734342 13.508587 13.867665 14.499656 13.958829 13.495051 8.701196 ... 14.179527 14.860188 14.610850 14.564967 14.809852 15.814871 15.672231 16.217617 15.594370 17.694980
2023-01-01 01:00:00 14.933424 13.887485 15.247237 13.952175 13.770243 14.039324 14.641018 14.113163 13.674277 9.230496 ... 14.203536 14.929559 14.533342 14.613296 14.461280 15.638823 15.624369 16.330934 15.905045 17.639993

361 rows × 382 columns
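Since the analysis depends on the time series columns matching the position index and on a uniform sampling interval, it can be worth checking both before running anything expensive. The sketch below uses small hypothetical stand-ins (pos_example and ts_example, filled with random numbers) rather than the sample file; the same checks should apply to pos_utm and ts_data_a.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for pos_utm and ts_data_a (three combiners, random values)
names = ["CMB-001-1", "CMB-001-2", "CMB-001-3"]
pos_example = pd.DataFrame({"E": [0.0, 25.0, 50.0], "N": [0.0, 0.0, 0.0]}, index=names)
times = pd.date_range("2023-01-01 00:00:00", "2023-01-01 01:00:00",
                      freq=pd.Timedelta(seconds=10))
rng = np.random.default_rng(0)
ts_example = pd.DataFrame(rng.random((len(times), len(names))),
                          index=times, columns=names)

# Every time series column should have a position, and vice versa
missing_pos = ts_example.columns.difference(pos_example.index)
missing_ts = pos_example.index.difference(ts_example.columns)

# The sampling interval should be a single, constant value
steps = ts_example.index.to_series().diff().dropna().unique()

print(len(ts_example), "samples")
print("columns without a position:", list(missing_pos))
print("positions without a time series:", list(missing_ts))
print("distinct sampling intervals:", len(steps))
```

One hour at 10 s resolution with both endpoints included gives 361 samples, which matches the row count shown above.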

It’s necessary to provide the CMV for each time period. Precomputed values for the two periods in the data file are given here; they can also be computed directly using the methods in solarspatialtools.cmv. Note that the two CMVs need to be roughly perpendicular, because the matrix calculation used to triangulate the predicted position becomes singular when the CMVs are too close to parallel (or anti-parallel).
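To illustrate why near-parallel CMVs are a problem, the sketch below sets up a simplified version of the triangulation. It assumes that each time period yields the along-wind distance of the target from a reference point (the delay multiplied by the cloud speed), which gives one linear equation per CMV direction. This is an illustration of the geometry under that assumption, not the library's implementation, and the helper names (unit_from_angle, solve_position) are hypothetical.

```python
import numpy as np

def unit_from_angle(theta):
    """Unit vector for a direction given as an angle in radians (hypothetical helper)."""
    return np.array([np.cos(theta), np.sin(theta)])

def solve_position(unit_a, unit_b, dist_a, dist_b):
    """Solve U @ p = d for the 2D offset p, one row of U per CMV direction."""
    U = np.array([unit_a, unit_b], dtype=float)
    d = np.array([dist_a, dist_b], dtype=float)
    return np.linalg.solve(U, d)

# A made-up true offset and two well-separated directions
p_true = np.array([30.0, -12.0])
u_a = unit_from_angle(0.62)
u_b = unit_from_angle(2.17)

# Along-wind distances that each period would report for p_true
p_est = solve_position(u_a, u_b, p_true @ u_a, p_true @ u_b)
print(p_est)

# The condition number grows rapidly as the directions approach parallel
cond_perp = np.linalg.cond(np.array([u_a, u_b]))
cond_near_parallel = np.linalg.cond(np.array([u_a, unit_from_angle(0.63)]))
print(cond_perp, cond_near_parallel)
```

With well-separated directions the system recovers the offset and is well conditioned. With directions only 0.01 rad apart, the condition number is far larger, so small timing errors would be amplified into large position errors.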

One other potential issue with CMVs is that the solarspatialtools.cmv module itself relies on accurate knowledge of the combiner positions. In our experience, plants with a small number of scrambled combiners can still produce useful CMVs, because the position errors average out when all the combiners in the plant are included. A completely scrambled plant with no confidence in any of the combiner positions would likely require special handling, or CMVs from an independent source. Users should be aware of this as a potential limitation of the technique for very low confidence plant maps.

[4]:
cmv_a = spatial.pol2rect(9.52, 0.62)
cmv_b = spatial.pol2rect(8.47, 2.17)
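A quick way to confirm that a CMV pair is suitable is to compute the angle between the two vectors. The sketch below assumes that the arguments to spatial.pol2rect are a magnitude and an angle in radians, and uses a NumPy stand-in (pol2rect_sketch, a hypothetical name) so that it runs without the library.

```python
import numpy as np

def pol2rect_sketch(r, theta):
    """Stand-in assuming (magnitude, angle in radians) -> (x, y)."""
    return r * np.cos(theta), r * np.sin(theta)

def angle_between_deg(v1, v2):
    """Angle between two 2D vectors, in degrees."""
    v1 = np.asarray(v1, dtype=float)
    v2 = np.asarray(v2, dtype=float)
    cosang = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
    # Clip guards against round-off pushing the cosine just outside [-1, 1]
    return np.degrees(np.arccos(np.clip(cosang, -1.0, 1.0)))

cmv_a_xy = pol2rect_sketch(9.52, 0.62)
cmv_b_xy = pol2rect_sketch(8.47, 2.17)

angle = angle_between_deg(cmv_a_xy, cmv_b_xy)
print(f"Angle between CMVs: {angle:.1f} degrees")
```

Under the radians assumption, the two sample CMVs are separated by about 88.8 degrees, which is consistent with the requirement that they be roughly perpendicular.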

Perform the Field Analysis

A high-level execution of the field routine for a single CMV pair is performed by compute_predicted_position. This routine works on a single combiner and a single CMV pair at a time. Running it in a loop allows multiple combiners to be tested (only 16 combiners are processed here, to limit computation time). It returns a variable (pos) that holds the predicted position of the target combiner, based on neighboring combiners that serve as a spatial reference.

Results are stored in a new DataFrame (df).

[5]:
df = pd.DataFrame(index=pos_utm.index, columns=['E', 'N', 'com-E', 'com-N'])

for ref in pos_utm.index[46:62]:

    pos, _ = field.compute_predicted_position(
        [ts_data_a, ts_data_b],  # The dataframes with the two one hour periods
        pos_utm,  # the dataframe specifying the combiner positions
        ref,  # the combiner id to calculate the position for
        [cmv_a, cmv_b],  # The two individual CMVs for the DFs
        mode='preavg',  # Mode for downselecting the neighboring points
        ndownsel=8)  # Num points to use for downselecting

    # Add this combiner's calculated position to the output DataFrame
    df.loc[ref] = [pos_utm.loc[ref, 'E'], pos_utm.loc[ref, 'N'], pos[0], pos[1]]
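Once the loop has run, the distance between the expected and predicted positions gives a single number per combiner that is easy to rank or threshold. The sketch below uses a small hypothetical table (df_example) with the same columns as df; rows that were not tested are left as NaN, as they are in df.

```python
import numpy as np
import pandas as pd

# Hypothetical results table with the same columns as df above
df_example = pd.DataFrame(
    {"E": [0.0, 25.0, 50.0],
     "N": [0.0, 0.0, 0.0],
     "com-E": [3.0, 25.0, np.nan],
     "com-N": [4.0, 0.0, np.nan]},
    index=["CMB-001-1", "CMB-001-2", "CMB-001-3"],
)

# Keep only the combiners that were tested and make sure the values are numeric
tested = df_example.dropna().astype(float)

# Euclidean distance between expected and predicted positions,
# in the same units as the position columns
error = np.hypot(tested["com-E"] - tested["E"],
                 tested["com-N"] - tested["N"]).rename("error")
print(error.sort_values(ascending=False))
```

The astype(float) call matters for df itself, because a DataFrame created empty and filled row by row holds object-typed columns.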

Results

The following code generates a plot showing the location of the combiners in the entire plant. The red lines indicate the difference between the expected and calculated positions of the combiners tested. The CMVs for the two time periods are indicated by the green and blue arrows to the side of the plot.

The results shown here indicate combiners whose predicted position generally agrees with the expected position from the plant design. It can be helpful to repeat this calculation for different CMV pairs and to average the positions to determine the degree to which the predictions are repeatable and independent of the CMV.

[6]:
plt.figure(figsize=(6, 6))
plt.scatter(pos_utm['E'], pos_utm['N'])

# Draw a line from the expected to the predicted position of each tested combiner
for _, r in df.dropna().iterrows():
    plt.plot([r['E'], r['com-E']], [r['N'], r['com-N']], 'r-+')

# Plot some arrows to show the CMV
for cmv, color in zip([cmv_a, cmv_b],['green','blue']):
    velvec = np.array(spatial.unit(cmv)) * 100
    plt.arrow(-100, 600, velvec[0], velvec[1],
              length_includes_head=True, width=7, head_width=20, color=color)

plt.axis('equal')
plt.xlabel('E')
plt.ylabel('N')
plt.title('Predicted Positions')
axes = plt.gca()
axes.xaxis.set_ticklabels([])
axes.yaxis.set_ticklabels([])
plt.tight_layout()

plt.show()

[Figure: plant layout showing the expected and predicted positions of the tested combiners, with arrows indicating the two CMVs]
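The repeatability check described above, in which predictions from several CMV pairs are averaged, can be sketched as follows. The predicted positions here are made-up numbers for a single combiner; in practice each row would come from one call to compute_predicted_position with a different CMV pair.

```python
import numpy as np

# Hypothetical predicted (E, N) positions for one combiner from three CMV pairs
predictions = np.array([
    [101.0, 48.0],
    [99.0, 52.0],
    [100.0, 50.0],
])

# Mean position across CMV pairs, and the sample standard deviation as a
# simple measure of how repeatable the prediction is
mean_pos = predictions.mean(axis=0)
spread = predictions.std(axis=0, ddof=1)

print("mean position:", mean_pos)
print("spread:", spread)
```

A small spread relative to the combiner spacing suggests that the prediction does not depend strongly on the choice of CMV pair.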