green-spider/Makefile

53 lines
1.3 KiB
Makefile
Raw Normal View History

IMAGE := quay.io/netzbegruenung/green-spider:latest
2018-04-05 19:39:09 +02:00
DB_ENTITY := spider-results
2018-04-05 19:39:09 +02:00
.PHONY: dockerimage spider export
2018-04-05 19:39:09 +02:00
# Build docker image
dockerimage:
Job-Verwaltung mit RQ, und vieles mehr (#149) * CLI: remove 'jobs' command, add 'manager' * Add job definition * Move jobs to manage folder * Rename jobs to manager * Add rq and redis dependencies * Add docker-compose YAML * Downgrade to alpine 3.8 * Adjust paths in Dockerfile, remove entrypoint * Rename 'make spiderjobs' to 'make jobs' * Fix docker exectution * Adapt 'make jobs' * Fix metadata scheme * Add docker dependency * Rendomize queue (a bit) * Use latest image, remove debug output * Make docker-compose file downwards-compatible * Use latest instead of dev image tag * Update docker-compose.yaml * Adapt job start script * Fix redis connection in manager * Add support for increasing timeout via environment variable * Adapt load_in_browser to cookies table schema change * Fix execution * Mitigate yaml warning * Bump some dependency versions * Report resource usage stats for each job * checks/load_in_browser: Return DOM size, prevent multiple page loads * Update .dockerignore * Code update * Script update * Update README.md * WIP * WIP commit * Update Dockerfile to alpine:edge and chromium v90 * Update TestCertificateChecker * Set defaults for __init__ function * Detect sunflower theme * Update unit test for new datetime (zero-basing) * Set logging prefs from Chromium in a new way * Move datastore client instantiation As it is not needed for all commands * Change green-directory repository URL * Add git settings for cloning green-directory * Pin alpine version 3.14, fix py3-cryptography * Use plain docker build progress output * Add volumes to 'make test' docker run command * Fix bug * Update example command in README * Update dependencies * Add creation of Kubernetes jobs
2021-11-11 20:15:43 +01:00
docker build --progress plain -t $(IMAGE) .
Job-Verwaltung mit RQ, und vieles mehr (#149) * CLI: remove 'jobs' command, add 'manager' * Add job definition * Move jobs to manage folder * Rename jobs to manager * Add rq and redis dependencies * Add docker-compose YAML * Downgrade to alpine 3.8 * Adjust paths in Dockerfile, remove entrypoint * Rename 'make spiderjobs' to 'make jobs' * Fix docker exectution * Adapt 'make jobs' * Fix metadata scheme * Add docker dependency * Rendomize queue (a bit) * Use latest image, remove debug output * Make docker-compose file downwards-compatible * Use latest instead of dev image tag * Update docker-compose.yaml * Adapt job start script * Fix redis connection in manager * Add support for increasing timeout via environment variable * Adapt load_in_browser to cookies table schema change * Fix execution * Mitigate yaml warning * Bump some dependency versions * Report resource usage stats for each job * checks/load_in_browser: Return DOM size, prevent multiple page loads * Update .dockerignore * Code update * Script update * Update README.md * WIP * WIP commit * Update Dockerfile to alpine:edge and chromium v90 * Update TestCertificateChecker * Set defaults for __init__ function * Detect sunflower theme * Update unit test for new datetime (zero-basing) * Set logging prefs from Chromium in a new way * Move datastore client instantiation As it is not needed for all commands * Change green-directory repository URL * Add git settings for cloning green-directory * Pin alpine version 3.14, fix py3-cryptography * Use plain docker build progress output * Add volumes to 'make test' docker run command * Fix bug * Update example command in README * Update dependencies * Add creation of Kubernetes jobs
2021-11-11 20:15:43 +01:00
# Fill the queue with spider jobs, one for each site.
jobs:
2018-08-23 09:38:30 +02:00
docker run --rm -ti \
-v $(PWD)/secrets:/secrets \
$(IMAGE) \
Job-Verwaltung mit RQ, und vieles mehr (#149) * CLI: remove 'jobs' command, add 'manager' * Add job definition * Move jobs to manage folder * Rename jobs to manager * Add rq and redis dependencies * Add docker-compose YAML * Downgrade to alpine 3.8 * Adjust paths in Dockerfile, remove entrypoint * Rename 'make spiderjobs' to 'make jobs' * Fix docker exectution * Adapt 'make jobs' * Fix metadata scheme * Add docker dependency * Rendomize queue (a bit) * Use latest image, remove debug output * Make docker-compose file downwards-compatible * Use latest instead of dev image tag * Update docker-compose.yaml * Adapt job start script * Fix redis connection in manager * Add support for increasing timeout via environment variable * Adapt load_in_browser to cookies table schema change * Fix execution * Mitigate yaml warning * Bump some dependency versions * Report resource usage stats for each job * checks/load_in_browser: Return DOM size, prevent multiple page loads * Update .dockerignore * Code update * Script update * Update README.md * WIP * WIP commit * Update Dockerfile to alpine:edge and chromium v90 * Update TestCertificateChecker * Set defaults for __init__ function * Detect sunflower theme * Update unit test for new datetime (zero-basing) * Set logging prefs from Chromium in a new way * Move datastore client instantiation As it is not needed for all commands * Change green-directory repository URL * Add git settings for cloning green-directory * Pin alpine version 3.14, fix py3-cryptography * Use plain docker build progress output * Add volumes to 'make test' docker run command * Fix bug * Update example command in README * Update dependencies * Add creation of Kubernetes jobs
2021-11-11 20:15:43 +01:00
python cli.py \
--credentials-path /secrets/datastore-writer.json \
--loglevel debug \
manager
2018-08-23 09:38:30 +02:00
# Run spider in docker image
spider:
2018-05-25 19:10:54 +02:00
docker run --rm -ti \
2019-06-03 08:08:55 +02:00
-v $(PWD)/volumes/dev-shm:/dev/shm \
2018-08-23 10:02:34 +02:00
-v $(PWD)/secrets:/secrets \
2019-06-03 08:08:55 +02:00
-v $(PWD)/volumes/chrome-userdir:/opt/chrome-userdir \
2019-11-22 08:39:56 +01:00
--shm-size=2g \
$(IMAGE) \
2018-08-23 10:02:34 +02:00
--credentials-path /secrets/datastore-writer.json \
--loglevel debug \
spider --kind $(DB_ENTITY) ${ARGS}
export:
docker run --rm -ti \
-v $(PWD)/secrets:/secrets \
2019-06-03 08:08:55 +02:00
-v $(PWD)/volumes/json-export:/json-export \
$(IMAGE) \
--credentials-path /secrets/datastore-reader.json \
--loglevel debug \
export --kind $(DB_ENTITY)
2018-04-05 19:39:09 +02:00
2018-08-27 21:17:04 +02:00
# run spider tests
test:
docker run --rm -ti \
Job-Verwaltung mit RQ, und vieles mehr (#149) * CLI: remove 'jobs' command, add 'manager' * Add job definition * Move jobs to manage folder * Rename jobs to manager * Add rq and redis dependencies * Add docker-compose YAML * Downgrade to alpine 3.8 * Adjust paths in Dockerfile, remove entrypoint * Rename 'make spiderjobs' to 'make jobs' * Fix docker exectution * Adapt 'make jobs' * Fix metadata scheme * Add docker dependency * Rendomize queue (a bit) * Use latest image, remove debug output * Make docker-compose file downwards-compatible * Use latest instead of dev image tag * Update docker-compose.yaml * Adapt job start script * Fix redis connection in manager * Add support for increasing timeout via environment variable * Adapt load_in_browser to cookies table schema change * Fix execution * Mitigate yaml warning * Bump some dependency versions * Report resource usage stats for each job * checks/load_in_browser: Return DOM size, prevent multiple page loads * Update .dockerignore * Code update * Script update * Update README.md * WIP * WIP commit * Update Dockerfile to alpine:edge and chromium v90 * Update TestCertificateChecker * Set defaults for __init__ function * Detect sunflower theme * Update unit test for new datetime (zero-basing) * Set logging prefs from Chromium in a new way * Move datastore client instantiation As it is not needed for all commands * Change green-directory repository URL * Add git settings for cloning green-directory * Pin alpine version 3.14, fix py3-cryptography * Use plain docker build progress output * Add volumes to 'make test' docker run command * Fix bug * Update example command in README * Update dependencies * Add creation of Kubernetes jobs
2021-11-11 20:15:43 +01:00
-v $(PWD)/volumes/dev-shm:/dev/shm \
-v $(PWD)/secrets:/secrets \
-v $(PWD)/screenshots:/screenshots \
2019-06-03 08:08:55 +02:00
-v $(PWD)/volumes/chrome-userdir:/opt/chrome-userdir \
--entrypoint "python3" \
$(IMAGE) \
-m unittest discover -p '*_test.py' -v