How to handle varying time periods #1105
Replies: 1 comment
Hello @jayjoshi33,

Thank you for contacting us! There is no single answer to how much data is required to train a Meridian model. It depends on how many parameters need to be estimated, whether the model is a geo-level or national-level model, the time granularity of the modeling window (weekly or monthly), and the quality of the data itself. Meridian models often benefit from more extensive data. We typically recommend a minimum of two years of weekly data for geo-level models and three years of data for national-level models. You may refer to our documentation on Amount of data needed for a better understanding of each of these considerations.

Coming to your specific situation: if the data is unavailable because there was no media activity in that channel, we recommend an outer join, using zeros for the channel during the period in which it had no media activity. This accurately represents the actual media activity and the drivers of the KPI, and it benefits the modeling process.

However, if there was media activity and you don't have the actual data for the channel, you may consider using approximate values or choosing an alternative media exposure metric. For example, if you don't have impressions or clicks data for that channel but still have media spend for the period, spend may be used as the exposure metric instead of impressions or clicks. This avoids shrinking the modeling window.

Using an inner join and omitting the period for which data isn't available should be considered a last resort. It may still work if the available data is of good quality, especially if you are running a geo-level model, but you may run into model convergence issues. In that case, using informative priors becomes crucial to the modeling process.

Do reach out if you have any further questions regarding this.

Google Meridian Support Team
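To make the difference between the two joins concrete, here is a minimal pandas sketch. It only illustrates the data preparation step and does not use any Meridian API. The column names (`time`, `channel_a_spend`, `channel_b_spend`), the dates, and the constant spend values are hypothetical, and the zero fill assumes the shorter channel really had no media activity in the missing weeks.

```python
import pandas as pd

# Hypothetical weekly national-level data: channel A covers about two years,
# channel B only the most recent ~1 year and 4 months (70 weeks).
weeks_a = pd.date_range("2023-01-02", periods=104, freq="W-MON")
weeks_b = weeks_a[-70:]

channel_a = pd.DataFrame({"time": weeks_a, "channel_a_spend": 100.0})
channel_b = pd.DataFrame({"time": weeks_b, "channel_b_spend": 50.0})

# Outer join: keep the full two-year window. Weeks missing from channel B
# come through as NaN and are filled with zeros, which is only appropriate
# if the channel truly had no media activity in those weeks.
outer = (
    channel_a.merge(channel_b, on="time", how="outer")
    .sort_values("time")
    .reset_index(drop=True)
)
outer["channel_b_spend"] = outer["channel_b_spend"].fillna(0.0)

# Inner join (last resort): the modeling window shrinks to the weeks
# where both channels have data.
inner = channel_a.merge(channel_b, on="time", how="inner")

print(len(outer), len(inner))
```

With these assumed inputs, the outer join keeps all 104 weeks and the inner join keeps only the 70 overlapping weeks. If the gap were caused by missing records rather than missing activity, filling with zeros would misstate the channel, and an approximation or an alternative exposure metric would be the better choice.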
I have national-level data for two channels. One channel has 2 years of data, while the other has 1 year and 4 months. In this case, which approach is better for modeling: using an inner join, which will limit the dataset to 1 year and 4 months for both channels, or using an outer join, which keeps the full 2 years of data but introduces many zeros in the channel with less data?