CommCare Data and Tableau

This post contains brief notes on the key info when accessing the CommCare survey data in the data warehouse (via Snowflake or published data sources),

Version control

2022 June 29 ML: updated 'Table relations : how to link CommCare tables to each other' : correct linking now also requires using field 'For Project WISE?'.

2022 May 20 ML: 1st draft.

Which Tables to Choose from the DWH Table list?

Names of the Tables with CommCare data always start with F_ (for example, F_1_SITE_INFORMATION).

These are main CommCare Tables, please use the tables with these exact names:

  1. F_1_SITE_INFORMATION  
  2. F_2_WATER_AND_HYGIENE
  3. F_3_SANITATION
  4. F_4_STRUCTURED_OBSERVATION
  5. F_SELECT_ALL_Q_STACKED

You will also find F_ tables with extended names (_ALL and _SS) but these should NOT be used for data analysis:

If you need to connect 2 or more CommCare tables, you need to create relationships on the following 4 fields (they are named identically among all F_ tables):

  1. Salesforce Site ID
  2. Survey Type (example: Pre-implementation, Post-Implementation, Routine Monitoring)
  3. Data Source (example: CommCare V1 or CommCare V2)
  4.  For Project Wise? (example: Yes, No, N/A)

If you miss any of the 4 fields above, Tableau will create the relationship and a data source, but this data source will produce incorrect results.

See examples of the published data extracts (with scheduled refresh) in Programs Published Data Sources - Tableau Online

If you need to connect one  CommCare table with one or more Salesforce table, you need to create relationship on the following field:

  • Salesforce Site ID

If your goal is to match the survey data for the Sites you found in the CommCare data set to the Salesforce data (for example, population # or Project WISE deployment status), CommCare Table (F_...) must be a Primary data source (left side of the data canvas) and Salesforce Table (SF_...) is a Secondary data source (right side).

If your goal is to add survey results to the Sites you found in the Salesforce data set (for example, water and electricity sufficiency or child club activities in past 6 months), Salesforce Table (SF_...) must be a Primary data source (left side of the data canvas) and CommCare Table (F_...) is a Secondary data source (right side).

Data Source: what is Commcare V1 and Commcare V2

Commcare V2 (v=version) contains data collected using the CommCare app starting October 2021. At that time a new version of the survey was rolled out, with some questions added, removed or replaced compared to V1.

Commcare V1 contains data collected using the CommCare app from 2019 to ~October 2021. The V1 results you see in the data warehouse have been shaped and cleaned to match the current survey version, V2.

As a result, some sites had data collected from 2019 to date in different version of the survey. If you select only 1 of the 2 versions as a Data Source in Tableau filter, the results will be limited (Pre-survey and Post- survey could happen in 2020 -> V1, but RM could happen in 2022 -> V2).