Fix OAI-PMH *from* race condition between server and FeederService
An OAI-PMH server may update its data only once per day during a cronjob.
FeederService.feed(..) and BundlesSourceService.updateAfterFeeding(..) currently assume, when they update the incremental harvesting "from" parameter, that the OAI-PMH server data is up-to-date and serves all OAI records between "from" and now. Here lies a race condition, exemplified by the following scenario.
todayAt22hrs: SchedulerConfiguration triggers -> FeederService harvests from lastDayAt22hrs until now (i.e. ~ todayAt22hrs). The OAI-PMH server does not report any new records in this timeframe, so FeederService receives no records. FeederService then updates the "from" value to todayAt22hrs.
todayAt23hrs: the daily OAI-PMH server cronjob kicks in and adds to its index all new records that have appeared between lastDayAt23hrs and now (i.e. ~ todayAt23hrs). Each new record carries a lastModifiedDate set to the time it was actually added during the business hours of the OAI-PMH server's editors, not to the time of the cronjob.
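Because these lastModifiedDate values lie before the already advanced "from" value (todayAt22hrs), the next harvest does not return the new records. This repeats every day: no records are ever harvested, and the new records are silently skipped.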
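The scenario above can be reproduced with a small model of the server index. This is only an illustrative sketch: FromRaceDemo, IndexedRecord, visibleSince and harvest(..) are hypothetical names, not part of the actual code base, and the model assumes the server filters by lastModifiedDate within [from, now] and only serves records the cronjob has already indexed.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

final class FromRaceDemo {

    /** Hypothetical index entry: when the record was modified vs. when the cronjob exposed it. */
    static final class IndexedRecord {
        final Instant lastModifiedDate;
        final Instant visibleSince;

        IndexedRecord(Instant lastModifiedDate, Instant visibleSince) {
            this.lastModifiedDate = lastModifiedDate;
            this.visibleSince = visibleSince;
        }
    }

    /** Returns what the modeled server would serve for a harvest of [from, now] executed at 'now'. */
    static List<IndexedRecord> harvest(List<IndexedRecord> index, Instant from, Instant now) {
        List<IndexedRecord> served = new ArrayList<>();
        for (IndexedRecord r : index) {
            // The record only exists for harvesters once the cronjob has indexed it.
            boolean alreadyIndexed = !r.visibleSince.isAfter(now);
            // Selection is based on lastModifiedDate, not on the indexing time.
            boolean inWindow = !r.lastModifiedDate.isBefore(from)
                    && !r.lastModifiedDate.isAfter(now);
            if (alreadyIndexed && inWindow) {
                served.add(r);
            }
        }
        return served;
    }
}
```

For a record modified at 14:00 and indexed at 23:00 of the same day, a harvest at 22:00 of that day returns nothing because the record is not yet indexed. A harvest one day later with "from" advanced to that 22:00 also returns nothing because 14:00 lies before "from". Only a harvest that kept the older "from" value would return the record.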
The solution would be to not update "from" to the time of startOfTodaysIncrementalHarvesting, but instead to keep it at (or update it to) the latest lastModifiedDate encountered in a previous harvest. This may result in the same "from" interval start being queried continuously for several days, until at least one new record is found, whose lastModifiedDate then becomes the new "from" value.
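A minimal sketch of the proposed update rule, under stated assumptions: FromCursor and nextFrom(..) are hypothetical names for illustration and do not reflect how FeederService or BundlesSourceService are actually structured. The rule takes the previous "from" value and the lastModifiedDate values of the records returned by one harvest.

```java
import java.time.Instant;
import java.util.Collection;
import java.util.Comparator;

final class FromCursor {

    private FromCursor() {}

    /**
     * Returns the next "from" value: the latest lastModifiedDate seen in this
     * harvest, or the unchanged previous value when the harvest returned
     * nothing newer. The harvest start time is deliberately not used.
     */
    static Instant nextFrom(Instant previousFrom, Collection<Instant> harvestedLastModifiedDates) {
        return harvestedLastModifiedDates.stream()
                .max(Comparator.naturalOrder())
                // Never move "from" backwards.
                .filter(latest -> latest.isAfter(previousFrom))
                .orElse(previousFrom);
    }
}
```

Since the OAI-PMH "from" argument is an inclusive lower bound, the record that supplied the new "from" value will most likely be returned again by the next harvest, so the feeding step would need to tolerate re-processing an already known record.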