mlscraper
Mirror of https://github.com/lorey/mlscraper
matthias
Age
Commit message
Author
2023-02-08
Bump cryptography from 37.0.2 to 39.0.1 in /requirements
dependabot/pip/requirements/cryptography-39.0.1
dependabot[bot]
2023-01-17
Fix status badge in README
Karl Lorey
2022-08-18
Fix testing examples by only running them in specialized environment
Karl Lorey
2022-08-18
Minor: fix formatting
Karl Lorey
2022-08-18
Test code in example folder (#28)
Leonardo Tarla
2022-07-07
Add test for nbsp issue #15
Karl Lorey
2022-07-07
Try to ignore all html in GitHub's linguist
Karl Lorey
2022-07-07
Try to ignore static html (again)
Karl Lorey
2022-07-07
Fix bug where ListScraper mistakenly exits early due to unsimilar roots. Fixe...
Karl Lorey
2022-07-07
Exclude static html from GitHub language stats
Karl Lorey
2022-07-07
Cache uniquely_selects to work around re-running train_scrapers
Karl Lorey
2022-07-07
Use mean instead of sum for similarity to improve understandability
Karl Lorey
2022-07-07
Limit recursive depth of get_similarity
Karl Lorey
2022-07-07
Cache classes property to avoid re-computations
Karl Lorey
2022-07-06
Move functionality to html module and fix minor errors with selector generation
Karl Lorey
2022-07-06
Minor performance improvements
Karl Lorey
2022-07-06
Generate selectors faster by leveraging recursion and caching
Karl Lorey
2022-07-06
Cache hashes of soup tags
Karl Lorey
2022-07-05
Minor fixes and improvements
Karl Lorey
2022-06-24
Bump version to 1.0.0rc3
v1.0.0rc3
Karl Lorey
2022-06-24
Add tests for github profiles
Karl Lorey
2022-06-24
Avoid matching numbers inside image dimensions
Karl Lorey
2022-06-24
Add match similarity to prioritize good match combinations
Karl Lorey
2022-06-24
Lazy hashing soup elements
Karl Lorey
2022-06-24
Add nth-child selector generation
Karl Lorey
2022-06-24
Also match all parents that contain the same text
Karl Lorey
2022-06-23
Add child selectors for CSS generation
Karl Lorey
2022-06-23
Improve performance by fixing hashing and root computation
Karl Lorey
2022-06-23
Add attribute-based CSS selectors
Karl Lorey
2022-06-23
Improve logging and errors for value matching. Closes #18
Karl Lorey
2022-06-23
Fix selection of arbitrary text within nodes (for now)
Karl Lorey
2022-06-23
Fix example formatting
Karl Lorey
2022-06-22
Check if adding space before code helps in README
Karl Lorey
2022-06-22
Add code to README
Karl Lorey
2022-06-21
Bump version to 1.0.0rc2
v1.0.0rc2
Karl Lorey
2022-06-21
Add exemplary return to quotes example
Karl Lorey
2022-06-21
Update README and add examples folder
Karl Lorey
2022-06-21
Ignore whitespace around values when searching for matches in HTML
Karl Lorey
2022-06-21
Bump version to 1.0.0rc1
v1.0.0rc1
Karl Lorey
2022-06-21
Remove bump2version dependency
Karl Lorey
2022-06-21
Set up packaging without MANIFEST.in
Karl Lorey
2022-06-21
Fix issues with docs build on clean repo
Karl Lorey
2022-06-21
Revise changelog
Karl Lorey
2022-06-21
Add release workflow
Karl Lorey
2022-06-21
Set next version to be 1.0.0rc1
Karl Lorey
2022-06-20
Fix docs
Karl Lorey
2022-06-20
Enable python 3.9+ upgrades for pyupgrade
Karl Lorey
2022-06-20
Set max_line_length to 88
Karl Lorey
2022-06-20
Rename ci workflow to tests workflow
Karl Lorey
2022-06-20
Use string= instead of text= for bs4 find_all
Karl Lorey