mlscraper
Mirror of https://github.com/lorey/mlscraper
matthias
Age
Commit message
Author
2022-06-20
Re-implement selector generation with a speedup >10x
develop
Karl Lorey
2022-06-19
Speed up training by filtering overlapping matches and preferring deep matches
Karl Lorey
2022-06-19
Fix css selector generation by adding tag name and avoiding empty selector
Karl Lorey
2022-06-17
Fix ListScraper and introduce maximum complexity parameter
Karl Lorey
2022-06-15
Revert previous commit and add comment why
Karl Lorey
2022-06-15
Loop through all possible selectors when training ListScraper
Karl Lorey
2022-06-15
Rewrite training module to decrease complexity
Karl Lorey
2022-06-15
Implement a factory for each page to solve identity and equality issues for now
Karl Lorey
2022-06-14
I might go insane with this one
Karl Lorey
2022-06-14
Use stackoverflow fixture throughout all tests
Karl Lorey
2022-06-14
Read stackoverflow sample with rb to get actual bytes
Karl Lorey
2022-06-14
Add missing conftest.py
Karl Lorey
2022-06-14
Pull stackoverflow test sample to module level
Karl Lorey
2022-06-13
Try to install lmxl dependendencies during CI
Karl Lorey
2022-06-13
Adapt python versions to 3.9+
Karl Lorey
2022-06-13
Swtich from travis to Github Actions, pre-commit, and tox
Karl Lorey
2022-06-13
Apply black and other styling
Karl Lorey
2022-06-13
Import mlscraper-experiments
Karl Lorey
2020-10-03
Circumvent aggressive caching to display updated badges
Karl Lorey
2020-09-29
I have to start testing example code...
Karl Lorey
2020-09-28
Use a wide image in readme to save space
Karl Lorey
2020-09-28
Fix readme code sample
Karl Lorey
2020-09-28
Update readme with rule-based scraper as this is easier to get going
Karl Lorey
2020-09-28
Update readme with improved description
Karl Lorey
2020-09-27
Update examples with stackoverflow scraper
Karl Lorey
2020-09-27
Add upcoming headline to HISTORY.rst and add bump2version rule
Karl Lorey
2020-09-27
Update README with badges and installation instructions
Karl Lorey
2020-09-27
Bump version: 0.1.1 → 0.1.2
v0.1.2
Karl Lorey
2020-09-27
Fix history formatting to prevent pypi rejection
Karl Lorey
2020-09-27
Bump version: 0.1.0 → 0.1.1
v0.1.1
Karl Lorey
2020-09-27
Fix username for token authentication in travis config
Karl Lorey
2020-09-27
Bump version: 0.0.0 → 0.1.0
v0.1.0
Karl Lorey
2020-09-27
Adapt bumpversion to black formatting
Karl Lorey
2020-09-27
Set verion to 0.0.0 to use bumpversion
Karl Lorey
2020-09-27
Add travis badge to readme
Karl Lorey
2020-09-27
Fix missing lxml requirement
Karl Lorey
2020-09-27
Add PyPI credentials
Karl Lorey
2020-09-27
Prepare python package with cookiecutter template
Karl Lorey
2020-09-26
Increase log level for make-based test runs
Karl Lorey
2020-09-26
Add test for id-based selectors
Karl Lorey
2020-09-26
Use heuristic to prefer simple rules in single-item scraper
Karl Lorey
2020-09-26
Skip unstable test
Karl Lorey
2020-09-26
Fix quote_to_scrape example
Karl Lorey
2020-09-26
Simplify interfaces even more (plain html only)
Karl Lorey
2020-09-26
Update README with new interfaces
Karl Lorey
2020-09-26
Unify interfaces
Karl Lorey
2020-09-26
Generate id and child-based selectors
Karl Lorey
2020-09-26
Make rule-based selection faster and more robust
Karl Lorey
2020-09-26
Add test set generation for single item pages
Karl Lorey
2020-09-25
Improved basic scraper to be fully functional
Karl Lorey