Marian Steinbach
ab942ca91d
Fix full JSON export
2019-05-04 23:00:00 +02:00
Marian Steinbach
9e5426ccde
Use alpine 3.9 base image
2019-05-03 23:20:08 +02:00
Marian Steinbach
cff4d55f17
Fix problem where pageload was not counted
2019-05-03 22:54:25 +02:00
Marian Steinbach
7621b7ef75
Remove debugging output
2019-05-03 22:54:05 +02:00
Marian Steinbach
56f9f1ba86
Check third party cookies
2019-04-29 10:09:25 +02:00
Marian Steinbach
5e8347916c
Bug fix in url_reachability check ( #108 )
* Fix detection of redirects to bad domains
* Fix bad domain check
* Add --url flag to spider for faster debugging
* Pass args to make spider
* Add spidering of a single URL for debugging purposes
* Fix tests
* Fix test in CI
* Remove pip upgrade
2019-04-19 00:35:28 +02:00
Marian Steinbach
2dfcf61cc0
Add netzbegruenung/green-spider-indexer to README
2019-04-11 22:43:16 +02:00
Marian Steinbach
16a05b751b
Several fixes for edge cases
2018-12-17 23:54:09 +01:00
Marian Steinbach
3b8328d804
Fixing several bugs in spider code
2018-12-17 17:31:09 +01:00
Marian Steinbach
3b9ead330d
Load feeds and gather info ( #103 )
2018-12-07 16:32:42 +01:00
Marian Steinbach
3063a4488d
Detect frameset ( #102 )
* Add frameset checker
* Remove unused variable (unrelated)
2018-12-07 16:31:56 +01:00
Marian Steinbach
deff95306b
Extend CMS detection for Urwahl3000 theme ( #96 )
* Extend check for Urwahl3000 theme
* Remove unused import
2018-12-05 21:27:45 +01:00
Marian Steinbach
d0e3a4210f
Fix link raters (social media links, contact link) ( #95 )
* Fix rating for contact_link and social_media_link
* Skip checks when dependencies not met
2018-11-28 23:46:40 +01:00
Marian Steinbach
eac5feb4f5
Kubernetes manifests: replace jobs with cronjobs
2018-11-28 22:19:03 +01:00
Marian Steinbach
678f319e73
Detect two more specific generators
2018-11-28 22:02:30 +01:00
Marian Steinbach
39cba1595a
Fix contact link rating
2018-11-23 22:16:26 +01:00
Marian Steinbach
3ba6940e94
Add criteria: social media links, contact link ( #90 )
* Add hyperlink checker
* Add rating for contact and social media links
* Update a comment
* Remove hyperlinks details from final payload
2018-11-20 22:47:34 +01:00
Marian Steinbach
4524cb5714
Consider site reachable only with status code < 400 ( #89 )
2018-11-20 20:14:52 +01:00
Marian Steinbach
c03ff21a9c
Simplify export ( #88 )
* Simplify exports
* Create file output in current working directory
2018-11-20 20:00:47 +01:00
Marian Steinbach
38481236ca
Add webapp deployment ( #87 )
* Add webapp deployment script
* Add some docs for webapp
* Some fixes in run-job.sh
* Update webapp deployment script
* Add some kubernetes job manifests
* Create index.yaml
* Remove local creation of the docker image from targets
* Update README.md
2018-11-20 19:54:23 +01:00
Marian Steinbach
924981659b
Allow Titillium together with Arvo ( #78 )
2018-11-05 23:18:11 +01:00
Marian Steinbach
325caee2bb
Detect generator jimdo ( #81 )
2018-11-05 23:00:01 +01:00
Marian Steinbach
df1f0bb452
Detect Drupal ( #80 )
2018-11-05 22:32:06 +01:00
Marian Steinbach
8ce8768465
Improve error messages in export
2018-10-08 08:42:29 +02:00
Marian Steinbach
0538e437ea
Fix NoneType error in rater responsive_layout
2018-10-07 21:14:29 +02:00
Marian Steinbach
18be6e7adf
Fix rater no_network_errors
2018-10-07 20:55:57 +02:00
Marian Steinbach
4251df6b06
Fixes for two problems found during spidering ( #75 )
2018-10-05 10:25:05 +02:00
Marian Steinbach
fd4a29da8e
Collect cookies in load_in_browser check ( #74 )
2018-10-04 21:21:30 +02:00
Marian Steinbach
2945372aaf
Fix README and Makefile ( #72 )
2018-10-04 21:20:53 +02:00
Marian Steinbach
c065da4957
More unittests for checks ( #73 )
* Add test for dns_resolution
* Add test for domain_variations
* Add test for duplicate_content
2018-10-03 22:43:22 +02:00
Marian Steinbach
57f8dea4e0
Improve certificate check to support SNI ( #71 )
* Fix the certificate check to support SNI
* Better tests for the certificate check
* Activate verbose output when running make test
* Add commenting on the spider test
2018-10-03 21:01:52 +02:00
Marian Steinbach
ae6a2e83e9
Refactor and modularize spider ( #70 )
See PR description for details
2018-10-03 11:05:42 +02:00
Marian Steinbach
7514aeb542
Add date and time to spider result export
2018-09-19 23:04:09 +02:00
Marian Steinbach
220b06fe79
Fix data export logging bug
2018-09-17 17:35:21 +02:00
Marian Steinbach
22d158c8b8
Fix docker image name in Makefile
2018-09-12 09:23:10 +02:00
Marian Steinbach
e64383c899
Add dev-shm to ignored list
2018-09-12 09:21:51 +02:00
Marian Steinbach
634ffa4a23
Merge pull request #67 from netzbegruenung/remove-phantomjs
Replace PhantomJS with Chromedriver
2018-09-12 09:02:52 +02:00
Marian Steinbach
8580747e2a
Docker file change from Debian stretch to Alpine
2018-09-12 00:42:40 +02:00
Marian Steinbach
eb9a29ac1c
Tweak chromedriver usage
2018-09-11 23:58:53 +02:00
Marian Steinbach
e83fd5ecc0
Update selenium to 3.14.0
2018-09-11 23:57:11 +02:00
Marian Steinbach
25e5fc936c
Replace PhantomJS with Chromedriver
2018-09-11 23:39:30 +02:00
Marian Steinbach
54b6d24b61
Update README.md
2018-08-28 22:39:19 +02:00
Marian Steinbach
d5e8f453aa
Merge pull request #65 from netzbegruenung/remove-webapp-code
Remove webapp code
2018-08-28 22:38:25 +02:00
Marian Steinbach
c642fd68fb
Update README
2018-08-28 22:28:40 +02:00
Marian Steinbach
fda97574cb
Remove webapp code
2018-08-28 22:28:34 +02:00
Marian Steinbach
fea2ce15b5
Merge pull request #64 from netzbegruenung/enable-screenshot-export
Enable screenshot export
2018-08-28 21:36:03 +02:00
Marian Steinbach
199d6da324
Uncomment screenshot export
2018-08-28 21:35:38 +02:00
Marian Steinbach
f2025a9476
Merge pull request #63 from netzbegruenung/add-license
Create LICENSE
2018-08-28 21:04:50 +02:00
Marian Steinbach
627384d1af
Merge pull request #62 from netzbegruenung/webapp-entry-types
Add support for RV and BV entry type display
2018-08-28 21:04:37 +02:00
Marian Steinbach
790faa20e3
Create LICENSE
2018-08-28 21:04:20 +02:00