Components
==========

The popularity analysis relies on several components.

**************
StorageUsageDB
**************

Despite its name, this database is where both the StorageUsageAgent and the PopularityAgent store their data. It is exposed via the StorageUsageHandler and the DataUsageHandler.

*****************
StorageUsageAgent
*****************

This agent scans the DFC and stores the size and number of files per directory and per StorageElement in the StorageUsageDB.

*******************
StorageHistoryAgent
*******************

This agent crawls the StorageUsageDB, converts each directory into a bookkeeping path, and fills in the following accounting types:

* Storage: space used/free per storage and/or directory
* Data storage: space used per bookkeeping path
* User storage: like Storage, but for user directories

****************
DataUsageHandler
****************

This service is called by the jobs to declare their use of a given directory. The usage is stored per directory and per day.

***************
PopularityAgent
***************

This agent goes through the StorageUsageDB and creates accounting entries for the popularity. It also caches the BK dictionary for each directory in the StorageUsageDB.

**************
DataPop server
**************

Service provided by Yandex that consumes our popularity CSV and predicts which datasets to remove. It runs on our Mesos cluster: https://lbmesosms02.cern.ch/marathon/ui/#/apps/%2Fdatapopserv

***********************
PopularityAnalysisAgent
***********************

This agent creates two files:

* one CSV containing a summary of the popularity (see :ref:`popularityCSV`).
* one CSV, generated from the first one through the DataPop server
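
To make the data flow concrete, here is a minimal sketch of how usage declared per directory and per day could be aggregated and written out as a CSV summary. It is only an illustration: the function names, the example directories, and the column names (``directory``, ``day``, ``count``) are hypothetical and do not reflect the actual DataUsageHandler API or the real popularity CSV layout.

```python
import csv
import io
from collections import defaultdict
from datetime import date


def aggregate_usage(declarations):
    """Sum usage counts per (directory, day) pair.

    `declarations` is an iterable of (directory, day, count) tuples;
    this shape is an assumption made for the illustration.
    """
    totals = defaultdict(int)
    for directory, day, count in declarations:
        totals[(directory, day)] += count
    return dict(totals)


def to_csv(totals):
    """Render the totals as CSV text, sorted by directory, then day."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    # Hypothetical header, not the real popularity CSV columns.
    writer.writerow(["directory", "day", "count"])
    for (directory, day), count in sorted(totals.items()):
        writer.writerow([directory, day.isoformat(), count])
    return buffer.getvalue()


# Hypothetical declarations, as jobs might report them.
declarations = [
    ("/example/dirA", date(2020, 1, 1), 3),
    ("/example/dirA", date(2020, 1, 1), 2),
    ("/example/dirB", date(2020, 1, 2), 1),
]
totals = aggregate_usage(declarations)
csv_text = to_csv(totals)
```

Keying the totals on the (directory, day) pair mirrors the "per directory and per day" granularity described for the DataUsageHandler; the real services store this in the StorageUsageDB rather than in memory.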