Field Analysis Demo

Field analysis is a calculation method that is intended for validating the expected locations of various discrete plant components (e.g. combiners). It identifies cloud motion within the individual component time series and infers their positions based upon relative timing of the cloud signal to each individual component. A full description of the technique is available in a paper: J. Ranalli and W. Hobbs, PV Plant Equipment Labels and Layouts can be Validated by Analyzing Cloud Motion in Existing Plant Measurements, Submitted to IEEE Journal of Photovoltaics.

This demo utilizes sample data on a single plant to demonstrate the application of the method.

[1]:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

from solarspatialtools import spatial, field

Read the Data

The sample data file is a compressed H5 file that contains all the info used to perform the analysis.

The Plant Layout

The layout of the plant is placed in the variable pos_utm. Its definition is contained in the field latlon, which produces a DataFrame of the combiner centroids in a UTM-like coordinate system. It contains combiner IDs as the index, and columns E and N that indicate the mean position of the combiner footprint (i.e. the centroid). The combiners are named following the pattern CMB-<INVERTER#>-<COMBINER#>.

[2]:

datafile = "data/sample_plant_2.h5"
pos_utm = pd.read_hdf(datafile, mode="r", key="utm")
pos_utm

[2]:

	E	N
CMB-001-1	23.584970	1086.290244
CMB-001-2	49.679324	1086.290365
CMB-001-3	84.519736	1086.290633
CMB-001-4	125.324187	1086.291124
CMB-001-5	160.281453	1086.291697
...	...	...
CMB-048-4	695.084107	31.788344
CMB-048-5	729.924241	31.791192
CMB-048-6	764.894729	31.794191
CMB-048-7	799.739358	31.797319
CMB-048-8	831.869055	31.800323

382 rows × 2 columns

The Time Series Data

The method relies on two discrete time periods with a fixed CMV. Two examples are stored in the data file with keys data_a and data_b. These each represent one hour of data for each combiner in the plant with a 10s sampling resolution. The DataFrames are indexed by a time that is artificially offset to begin at a time of 00:00:00 on an arbitrary day. The columns of the DataFrames have keys that match the index of pos_utm.

[3]:

ts_data_a = pd.read_hdf(datafile, mode="r", key="data_a")
ts_data_b = pd.read_hdf(datafile, mode="r", key="data_b")
ts_data_a

[3]:

	CMB-001-1	CMB-001-2	CMB-001-3	CMB-001-4	CMB-001-5	CMB-001-6	CMB-001-7	CMB-002-1	CMB-002-2	CMB-002-3	...	CMB-047-7	CMB-047-8	CMB-048-1	CMB-048-2	CMB-048-3	CMB-048-4	CMB-048-5	CMB-048-6	CMB-048-7	CMB-048-8
2023-01-01 00:00:00	93.799402	93.985920	93.622077	94.318063	93.390276	93.459779	93.753268	95.253145	95.510679	95.718582	...	92.871630	92.798612	91.074627	91.385004	91.554207	91.528461	91.404796	91.747153	91.461591	91.606280
2023-01-01 00:00:10	94.508405	94.819152	95.028923	95.105311	93.855248	94.410892	94.540710	95.633790	95.760610	95.699990	...	92.753680	92.873880	91.292149	91.850400	91.644421	91.739311	91.842573	91.653509	91.475838	91.425386
2023-01-01 00:00:20	94.767218	95.245977	95.438032	95.826806	94.000549	94.737096	94.925867	95.603580	95.745574	95.265605	...	92.933072	92.732030	91.251834	91.799432	91.926831	91.726960	91.657963	91.725119	91.453577	91.490994
2023-01-01 00:00:30	94.880949	95.406670	95.139340	95.771861	94.032838	94.730620	95.055108	95.470650	95.914626	96.075220	...	92.942349	92.877739	91.443098	92.072665	91.937813	91.629913	91.534880	91.854972	91.431316	91.431947
2023-01-01 00:00:40	94.837813	95.538455	95.515880	95.826806	94.407401	94.698243	95.155677	95.337730	95.736634	95.525902	...	92.886374	92.751329	91.496539	92.014166	91.867211	91.532872	91.664974	91.688921	91.409054	91.577226
...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...
2023-01-01 00:59:20	14.373001	13.225903	14.791491	13.385478	13.085343	13.466051	14.118774	13.721180	13.306969	8.306490	...	14.580523	15.408295	14.735548	15.268503	15.645326	17.442542	17.913292	17.870185	16.114405	16.653684
2023-01-01 00:59:30	14.583977	13.338135	14.856895	13.349449	13.317424	13.456741	14.205222	13.811406	13.327289	8.690166	...	14.611648	15.615766	14.689607	14.798921	14.540775	16.176576	18.011441	18.704340	17.892674	19.351115
2023-01-01 00:59:40	14.610643	13.486927	14.896615	13.516509	13.312177	13.673675	14.273695	13.934259	13.347608	8.382548	...	14.247965	15.191175	14.563032	14.489767	14.585489	15.712535	15.359870	15.863494	18.463466	23.291351
2023-01-01 00:59:50	14.594957	13.737747	15.056860	13.734342	13.508587	13.867665	14.499656	13.958829	13.495051	8.701196	...	14.179527	14.860188	14.610850	14.564967	14.809852	15.814871	15.672231	16.217617	15.594370	17.694980
2023-01-01 01:00:00	14.933424	13.887485	15.247237	13.952175	13.770243	14.039324	14.641018	14.113163	13.674277	9.230496	...	14.203536	14.929559	14.533342	14.613296	14.461280	15.638823	15.624369	16.330934	15.905045	17.639993

361 rows × 382 columns

It’s necessary to provide the CMV for each time period. These precomputed values for the time periods in the data file are provided here. It’s also possible to compute them directly using methods in solarspatialtools.cmv. Note that it’s necessary for the CMVs to be roughly perpendicular, because the matrix calculation associated with triangulating the predicted position becomes singular for CMVs that are too close to parallel (or anti-parallel).

One other potential issue with CMVs is that the spatialsolartools.cmv module is also based on accurate knowledge of the combiner positions. In our experience, even plants with small numbers of scrambled combiners can still produce useful CMVs, because the position errors are averaged out when including all the combiners in the entire plant. A completely scrambled plant with no confidence in any of the combiner positions would likely require special handling, or CMVs from an independent source. But users should be aware of this as a potential limitation of the technique for very low confidence plant maps.

[4]:

cmv_a = spatial.pol2rect(9.52, 0.62)
cmv_b = spatial.pol2rect(8.47, 2.17)

Perform the Field Analysis

A high level execution of the field routine for a single CMV pair is performed by compute_predicted_position. This routine works on a single combiner at a time and on a single CMV pair at a time. Running it in a loop allows multiple combiners to be tested (only 10 points are shown here, due to computational time considerations). It returns a variable (pos) that holds the predicted position of the target combiner on the basis of other neighboring combiners that that serve as a spatial reference.

Results are stored in a new DataFrame (df).

[5]:

df = pd.DataFrame(index=pos_utm.index, columns=['E', 'N', 'com-E', 'com-N'])

for ref in pos_utm.index[46:62]:

    pos, _ = field.compute_predicted_position(
        [ts_data_a, ts_data_b],  # The dataframes with the two one hour periods
        pos_utm,  # the dataframe specifying the combiner positions
        ref,  # the combiner id to calculate the position for
        [cmv_a, cmv_b],  # The two individual CMVs for the DFs
        mode='preavg',  # Mode for downselecting the neighboring points
        ndownsel=8)  # Num points to use for downselecting

    # Add this combiner's calculated position to the output DataFrame
    df.loc[ref] = [pos_utm.loc[ref]['E'], pos_utm.loc[ref]['N'], pos[0], pos[1]]

Results

The following code generates a plot showing the location of the combiners in the entire plant. The red lines indicate the difference between the expected and calculated positions of the combiners tested. The CMVs for the two time periods are indicated by the green and blue arrows to the side of the plot.

The results shown here indicate combiners whose predicted position generally agrees with the expected position from the plant design. It can be helpful to repeat this calculation for different CMV pairs and to average the positions to determine the degree to which the predictions are repeatable and independent of the CMV.

[6]:

plt.figure(figsize=(6, 6))
plt.scatter(pos_utm['E'], pos_utm['N'])

for row in df.iterrows():
    r = row[1]
    plt.plot([r['E'], r['com-E']], [r['N'], r['com-N']], 'r-+')

# Plot some arrows to show the CMV
for cmv, color in zip([cmv_a, cmv_b],['green','blue']):
    velvec = np.array(spatial.unit(cmv)) * 100
    plt.arrow(-100, 600, velvec[0], velvec[1],
              length_includes_head=True, width=7, head_width=20, color=color)

plt.axis('equal')
plt.xlabel('E')
plt.ylabel('N')
plt.title(f'Predicted Positions')
axes = plt.gca()
axes.xaxis.set_ticklabels([])
axes.yaxis.set_ticklabels([])
plt.tight_layout()

plt.show()