Age | Commit message | Author
2022-12-09 | Bump certifi from 2022.6.15 to 2022.12.7 in /requirements (dependabot/pip/requirements/certifi-2022.12.7) | dependabot[bot]
2022-08-18 | Fix testing examples by only running them in specialized environment | Karl Lorey
2022-08-18 | Minor: fix formatting | Karl Lorey
2022-08-18 | Test code in example folder (#28) | Leonardo Tarla
2022-07-07 | Add test for nbsp issue #15 | Karl Lorey
2022-07-07 | Try to ignore all html in GitHub's linguist | Karl Lorey
2022-07-07 | Try to ignore static html (again) | Karl Lorey
2022-07-07 | Fix bug where ListScraper mistakenly exits early due to unsimilar roots. Fixe... | Karl Lorey
2022-07-07 | Exclude static html from GitHub language stats | Karl Lorey
2022-07-07 | Cache uniquely_selects to work around re-running train_scrapers | Karl Lorey
2022-07-07 | Use mean instead of sum for similarity to improve understandability | Karl Lorey
2022-07-07 | Limit recursive depth of get_similarity | Karl Lorey
2022-07-07 | Cache classes property to avoid re-computations | Karl Lorey
2022-07-06 | Move functionality to html module and fix minor errors with selector generation | Karl Lorey
2022-07-06 | Minor performance improvements | Karl Lorey
2022-07-06 | Generate selectors faster by leveraging recursion and caching | Karl Lorey
2022-07-06 | Cache hashes of soup tags | Karl Lorey
2022-07-05 | Minor fixes and improvements | Karl Lorey
2022-06-24 | Bump version to 1.0.0rc3 (v1.0.0rc3) | Karl Lorey
2022-06-24 | Add tests for github profiles | Karl Lorey
2022-06-24 | Avoid matching numbers inside image dimensions | Karl Lorey
2022-06-24 | Add match similarity to prioritize good match combinations | Karl Lorey
2022-06-24 | Lazy hashing soup elements | Karl Lorey
2022-06-24 | Add nth-child selector generation | Karl Lorey
2022-06-24 | Also match all parents that contain the same text | Karl Lorey
2022-06-23 | Add child selectors for CSS generation | Karl Lorey
2022-06-23 | Improve performance by fixing hashing and root computation | Karl Lorey
2022-06-23 | Add attribute-based CSS selectors | Karl Lorey
2022-06-23 | Improve logging and errors for value matching. Closes #18 | Karl Lorey
2022-06-23 | Fix selection of arbitrary text within nodes (for now) | Karl Lorey
2022-06-23 | Fix example formatting | Karl Lorey
2022-06-22 | Check if adding space before code helps in README | Karl Lorey
2022-06-22 | Add code to README | Karl Lorey
2022-06-21 | Bump version to 1.0.0rc2 (v1.0.0rc2) | Karl Lorey
2022-06-21 | Add exemplary return to quotes example | Karl Lorey
2022-06-21 | Update README and add examples folder | Karl Lorey
2022-06-21 | Ignore whitespace around values when searching for matches in HTML | Karl Lorey
2022-06-21 | Bump version to 1.0.0rc1 (v1.0.0rc1) | Karl Lorey
2022-06-21 | Remove bump2version dependency | Karl Lorey
2022-06-21 | Set up packaging without MANIFEST.in | Karl Lorey
2022-06-21 | Fix issues with docs build on clean repo | Karl Lorey
2022-06-21 | Revise changelog | Karl Lorey
2022-06-21 | Add release workflow | Karl Lorey
2022-06-21 | Set next version to be 1.0.0rc1 | Karl Lorey
2022-06-20 | Fix docs | Karl Lorey
2022-06-20 | Enable python 3.9+ upgrades for pyupgrade | Karl Lorey
2022-06-20 | Set max_line_length to 88 | Karl Lorey
2022-06-20 | Rename ci workflow to tests workflow | Karl Lorey
2022-06-20 | Use string= instead of text= for bs4 find_all | Karl Lorey
2022-06-20 | Apply python 3.9+ features | Karl Lorey