Commit graph

28 commits

Author SHA1 Message Date
Marian Steinbach ab58152b8e
Set chromium version to 123.0.6312.86-r0 (#354) 2024-04-03 15:45:02 +02:00
Marian Steinbach 7d43d75b93
Update to Chromium v122; improvements to the container image build (#339)
* Use Chromium v122

* Pin base image to checksum

* Change package sources, do not use edge/main
2024-03-01 17:51:56 +01:00
Marian Steinbach 0efab3295c
Chromium and Alpine update (#319)
* Update base image to alpine v3.19

* Update Chromium to 121

* Remove libssl1.1 (no longer available in alpine 3.19)

* Add --break-system-packages to pip install commands

* Print debugging info

* Upgrade actions/checkout to v4

* Fix entrypoints and commands

* Upgrade pyOpenSSL to v24.0.0

* Upgrade tenacity to v8
2024-02-21 09:19:33 +01:00
Marian Steinbach 5e723c94db
Make and use a versioned docker image (#279)
* Revert redis module to 4.1.0

* Revert dnspython to 2.1.0

* Revert click to 8.0.3

* Specify alpine 3.16.2, reorganize into multiple steps

* Replace 'latest' with 'main' everywhere

* Fix deprecation warnings

* Add Google root certificates

* Re-order APK packages, write list after installing

* Create VERSION file during docker image build

* Pin chromium version
2022-10-24 21:35:15 +02:00
Marian Steinbach 51052ebac1
Add tests action (#273)
* Add tests action

* Change test execution command

* Change action towards building a docker image

* Use pyyaml from alpine package

* Add test execution

* Enable running without terminal in CI

* Add creation of empty secret file

* Fill service account file

* Fix JSON

* Fix quitting

* Add more fields to fake secret

* Wrap execution in try/except

* Fix: local variable 'result' referenced before assignment
2022-10-21 16:27:15 +02:00
Marian Steinbach a864675d85
Upgrade PyOpenSSL, remove unit tests using TLS 1.0 and 1.1 (#274)
* Upgrade pyOpenSSL to v22.1.0
* Remove unit test using TLS 1.0 and 1.1
* Upgrade base image to alpine 3.15.6
2022-10-18 09:48:25 +02:00
Marian Steinbach 5cbbcc9861
Update alpine to v3.15.0 (#236)
* Set alpine version to 3.15.0

* Revert workaround for py3-cryptography
2021-12-06 21:26:30 +01:00
Marian Steinbach 618e29d763
Job management with RQ, and much more (#149)
* CLI: remove 'jobs' command, add 'manager'

* Add job definition

* Move jobs to manage folder

* Rename jobs to manager

* Add rq and redis dependencies

* Add docker-compose YAML

* Downgrade to alpine 3.8

* Adjust paths in Dockerfile, remove entrypoint

* Rename 'make spiderjobs' to 'make jobs'

* Fix docker execution

* Adapt 'make jobs'

* Fix metadata scheme

* Add docker dependency

* Randomize queue (a bit)

* Use latest image, remove debug output

* Make docker-compose file downwards-compatible

* Use latest instead of dev image tag

* Update docker-compose.yaml

* Adapt job start script

* Fix redis connection in manager

* Add support for increasing timeout via environment variable

* Adapt load_in_browser to cookies table schema change

* Fix execution

* Mitigate yaml warning

* Bump some dependency versions

* Report resource usage stats for each job

* checks/load_in_browser: Return DOM size, prevent multiple page loads

* Update .dockerignore

* Code update

* Script update

* Update README.md

* WIP

* WIP commit

* Update Dockerfile to alpine:edge and chromium v90

* Update TestCertificateChecker

* Set defaults for __init__ function

* Detect sunflower theme

* Update unit test for new datetime (zero-basing)

* Set logging prefs from Chromium in a new way

* Move datastore client instantiation

As it is not needed for all commands

* Change green-directory repository URL

* Add git settings for cloning green-directory

* Pin alpine version 3.14, fix py3-cryptography

* Use plain docker build progress output

* Add volumes to 'make test' docker run command

* Fix bug

* Update example command in README

* Update dependencies

* Add creation of Kubernetes jobs
2021-11-11 20:15:43 +01:00
Marian Steinbach ff6eae3955
Upgrade to alpine 3.9 (#135)
* Upgrade to alpine 3.9

* Set Python version to 3.7 (#136)
2019-11-24 23:54:26 +01:00
Marian Steinbach b3bb8f34c3
Fix for exception data in the result that cannot be written, and spidering of individual sites (#132)
* WIP commit for single job execution

* Convert exception to string

* Pass more arguments

* Move python modules list into requirements.txt

* Document single site spidering

* Remove debugging
2019-11-22 23:13:57 +01:00
Marian Steinbach 68f2288617
Check DNS for IPv6 AAAA record (#124)
* Add check for IPv6 AAAA record

* Adapt rating/resolvable
2019-07-15 22:59:33 +02:00
Marian Steinbach 0c59111044
Alpine downgrade to 3.8 (#118) 2019-06-04 08:08:53 +02:00
Marian Steinbach 576050d3cd Update alpine repository URLs 2019-06-03 08:12:30 +02:00
Marian Steinbach 9e5426ccde Use alpine 3.9 base image 2019-05-03 23:20:08 +02:00
Marian Steinbach 3b9ead330d
Load feeds and gather info (#103) 2018-12-07 16:32:42 +01:00
Marian Steinbach ae6a2e83e9
Refactor and modularize spider (#70)
See PR description for details
2018-10-03 11:05:42 +02:00
Marian Steinbach 8580747e2a Docker file change from Debian stretch to Alpine 2018-09-12 00:42:40 +02:00
Marian Steinbach e83fd5ecc0 Update selenium to 3.14.0 2018-09-11 23:57:11 +02:00
Marian Steinbach 25e5fc936c Replace PhantomJS with Chromedriver 2018-09-11 23:39:30 +02:00
Marian Steinbach 545ea671d8 Add retry for get_job_from_queue 2018-08-27 22:40:31 +02:00
Marian Steinbach 0d9b44b384 Dockerfile update
- remove unused certifi module
- add google-cloud-datastore module
- add export script
2018-08-23 09:37:53 +02:00
Marian Steinbach 997519df35 More tests 2018-05-04 10:02:01 +02:00
Marian Steinbach 9bea186008 Clean up dependencies 2018-05-04 00:38:28 +02:00
Marian Steinbach 1e4cb2bce8 Run python unit tests in docker container 2018-05-03 12:01:30 +02:00
Marian Steinbach c21511cd62 Add phantomjs 2018-05-03 11:29:21 +02:00
Marian Steinbach 15697016e7 Change docker image to contain chrome 2018-05-03 10:56:04 +02:00
Marian Steinbach 4de5890c60 Change to run spider in Docker container 2018-05-03 10:22:10 +02:00
Marian Steinbach 7ce779e52d Add Dockerfile 2018-05-03 08:54:05 +02:00