Merge branch 'release/0.8.1' v0.8.1

author: Edwin van Leeuwen <edwinvanl@tuta.io> 2022-02-26 09:35:20 +0000
committer: Edwin van Leeuwen <edwinvanl@tuta.io> 2022-02-26 09:35:20 +0000
commit: 83f5a35a3330c0c2d82853186e369c22d963d357 (patch)
tree: d7b03fde7361c8a38d67923eaa40816509c2cc67
parent: d69f1b8f154a42f9b440b8d0885b9559e54c6753 (diff)
parent: 024e3e066c3dd7cd46023dde42237768ad2f46fe (diff)
19 files changed, 441 insertions, 314 deletions
diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
index f5fccca..c458af9 100644
--- a/.gitlab-ci.yml
+++ b/.gitlab-ci.yml
@@ -15,8 +15,9 @@ build:arch:
   script:
     - yay -Sy --noconfirm cmake git cpr nlohmann-json pugixml
     - cd aur
-    - echo $CI_COMMIT_BRANCH
-    - sed -i 's:PLACEHOLDER:'$CI_COMMIT_BRANCH':' PKGBUILD
+# If this does not work, then try CI_COMMIT_REF_NAME
+    - echo $CI_COMMIT_REF_SLUG
+    - sed -i 's:PLACEHOLDER:'$CI_COMMIT_REF_SLUG':' PKGBUILD
     - makepkg PKGBUILD
   artifacts:
     paths:
diff --git a/CMakeLists.txt b/CMakeLists.txt
index ebe7e2d..e25afb1 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -100,6 +100,7 @@ target_compile_options(${TARGET} PRIVATE -Wall -Wextra -Wpedantic -Werror)
 
 install(TARGETS ${TARGET} RUNTIME DESTINATION bin)
 
+if (${CMAKE_BUILD_TYPE} STREQUAL "Debug")
 FILE(GLOB TESTFILES test/catch_*.cpp)
 foreach(TESTFILE ${TESTFILES})
   get_filename_component(NAME ${TESTFILE} NAME_WE) 
@@ -120,3 +121,4 @@ foreach(TESTFILE ${TESTFILES})
     )
   endif()
 endforeach()
+endif()
diff --git a/README.md b/README.md
index fed4f91..348a1dd 100644
--- a/README.md
+++ b/README.md
@@ -4,6 +4,16 @@ Read-the-things-tui (rttt) lets you read the things from the terminal. The thing
 
 ## Installing
 
+### Arch
+
+'rttt' is included in the 'aur' repository, so you can install it from there:
+
+```bash
+yay -S rttt-git
+```
+
+### From source
+
 ```bash
 git clone https://gitlab.com/BlackEdder/rttt.git 
 cd rttt 
diff --git a/TODO.md b/TODO.md
index be0a623..0949d1b 100644
--- a/TODO.md
+++ b/TODO.md
@@ -8,6 +8,8 @@ This is a very rough list of ideas and future todo items.
 
 ## Refactor
 
+- [ ] Move to iterator flat_view for getComments etc.
+  - All mark_active/mark_for_update should be the responsibility of drawItems? or getComments? Make a decision
 - [ ] Make reddit responsible for itself (state() to setup)
   - When using config just read it and write it when changed
 - [ ] Reddit retrieve access token should ideally be async?
@@ -22,25 +24,14 @@ This is a very rough list of ideas and future todo items.
     - https://oleksandrkvl.github.io/2021/04/02/cpp-20-overview.html#three-way-comparison (should we imply strict ordering)?
     - Seems like we don't need an equivalence operator (`==`) https://stackoverflow.com/questions/20168173/when-using-stdmap-should-i-overload-operator-for-the-key-type
 - [ ] Proper iterator for active_storage, which only exposes the key, value not the value_wrapper
-- [ ] Different containers where appropriate. Note that drawItems only expects iterators. Does not rely on indexable anywhere, so we can easily pass non vectors to it. I.e. in rss.hpp a set with custom sort order might be better.
 - [ ] Consistent naming style. Let's try the standard library style -> underscore and lowercase everywhere, use namespaces to differentiate: request::state state; etc.
 - [ ] Use raw literals wherever possible? https://www.geeksforgeeks.org/raw-string-literal-c/
-- [ ] Get rid of linearise either by loading kids for reddit similar to current hackernews implementation or writing a nice iterator (see the good youtube video I saved on newpipe)
  
 ## Features
 
-- [x] Prepare for AUR package (after the crashing but is fixed)
-    - [x] Create a PKGBUILD file: https://wiki.archlinux.org/title/Arch_package_guidelines
-    - [x] Add pkgbuild test into gitlab-ci
-      - https://stackoverflow.com/questions/33387622/multiple-docker-images-in-gitlab-ci-yml      - testcab/yay package has yay installed
-      - https://gitlab.com/gitlab-org/gitlab-foss/-/issues/3246
-      - https://stackoverflow.com/questions/43655704/how-to-build-archlinux-pkgbuild-inside-docker-with-gitlab-ci
-      - Remove _deps before running tests, make sure we are not linked to there
-      - Can we have test depend on specific build: https://stackoverflow.com/questions/68877379/how-to-define-a-gitlab-ci-job-to-depend-on-either-one-or-one-another-previous-jo (dependencies)
-      - Use sed on the PKGBUILD to point to the current branch that the test is running for
-- [ ] Submit AUR pkg to arch
-    - [ ] Get an account on aur with publickey: https://wiki.archlinux.org/title/AUR_submission_guidelines (section on authentication)
-    - [ ] Fix default git path (to develop branch) in makepkg
+- [ ] Improve error handling Reddit
+    - Non exiting paths
+    - Refresh after requesting new access token
 - [ ] Always add /r/front to the tab completion
 - [ ] Reddit: it might be possible to get default subreddits from /subreddits when not logged in. We could use that to get tab completion working for those. Note that ideally we would then remove them again on login
 - [ ] Save state to .cache/rttt? 
@@ -65,12 +56,10 @@ This is a very rough list of ideas and future todo items.
 - [ ] Reddit Voting
 - [ ] Reddit Posting of comments
 - [ ] Better parseHTML
-- [ ] Parse markdown url better [url](url)
+- [ ] Support reddit subreddits in the 'o' menu. I.e. if someone mentions r/neovim, then it should be possible to open /r/neovim in rttt
+- [ ] Twitter access?
 
 ## Bug
 
-- [x] If I press j/k a lot before a page has loaded (subreddit) then it seems to crash
-- [x] Message queue should still empty even if status window is closed
-- [x] parseHTML should be used for RSS as well
 - [ ] Crash after suspend (long time of inactivity)
 - [ ] If we are in non text mode and we press v, while on a large_item we end up with the highlighted item off screen. Presumably because scrolltop is wrong
diff --git a/aur/PKGBUILD b/aur/PKGBUILD
index 0caab9d..c0441eb 100644
--- a/aur/PKGBUILD
+++ b/aur/PKGBUILD
@@ -3,7 +3,7 @@ pkgname=rttt-git
 _pkgname='rttt'
 pkgrel=1
 pkgver=v0.6.0.r8.b4e22d4
-pkgdesc="Read-the-things-tui (rttt) lets you read the things from the terminal. The things currently include RSS/Atom, hackernews and Reddit."
+pkgdesc="Read-the-things-tui (rttt) lets you read RSS/Atom, hackernews and Reddit from the terminal."
 arch=("x86_64")
 url="https://gitlab.com/BlackEdder/rttt"
 license=('GPL3')
diff --git a/src/config.hpp b/src/config.hpp
index cd01631..659b64a 100644
--- a/src/config.hpp
+++ b/src/config.hpp
@@ -22,7 +22,7 @@ inline std::filesystem::path getPath() {
 
 inline std::filesystem::path getFilePath() { return getPath() / "config.json"; }
 
-inline nlohmann::json defaultConfig() {
+nlohmann::json defaultConfig() {
   nlohmann::json config;
   config["reddit"]["client_id"] = "8yDBiibHONI95SeMWLZspg";
   config["reddit"]["device_id"] = rttt::random_string(25);
@@ -52,16 +52,12 @@ inline nlohmann::json load() {
   } else {
     FILE *pFile;
     pFile = fopen(fpath.c_str(), "r");
-    config = nlohmann::json::parse(pFile);
-    // backward compatibility
-    bool changed = false;
-    // TODO: can we merge loaded config with defaultConfig?
-    if (!config["reddit"].contains("refresh_token")) {
-      config["reddit"]["refresh_token"] = "";
-      changed = true;
-    }
-    if (changed)
-      config::save(config);
+    auto saved_config = nlohmann::json::parse(pFile);
+
+    // Make sure config contains all expected values
+    config = defaultConfig();
+    // Overwrite with saved values
+    config.update(saved_config);
   }
   return config;
 }
diff --git a/src/hackernews.hpp b/src/hackernews.hpp
index a244f6a..64f1fb9 100644
--- a/src/hackernews.hpp
+++ b/src/hackernews.hpp
@@ -2,6 +2,7 @@
 
 #include <map>
 #include <queue>
+#include <ranges>
 #include <set>
 #include <string>
 #include <unordered_set>
@@ -58,11 +59,12 @@ struct PollOpt {
 
 using ItemContainer = rttt::active_storage<int, rttt::ItemVariant>;
 
-inline std::vector<std::pair<size_t, rttt::ItemVariant>>
-get_kids_with_indent(const rttt::ItemVariant &hnitem, size_t indent,
-                     ItemContainer &known_hnitems,
-                     const rttt::active_storage<std::string, ui::item_view_state> &item_view_states,
-                     int head) {
+inline std::vector<std::pair<size_t, rttt::ItemVariant>> get_kids_with_indent(
+    const rttt::ItemVariant &hnitem, size_t indent,
+    ItemContainer &known_hnitems,
+    const rttt::active_storage<std::string, ui::item_view_state>
+        &item_view_states,
+    int head) {
 
   std::vector<std::pair<size_t, rttt::ItemVariant>> items;
   if (head <= 0)
@@ -75,8 +77,9 @@ get_kids_with_indent(const rttt::ItemVariant &hnitem, size_t indent,
     kids = std::get<hackernews::Comment>(hnitem).kids;
   else if (std::holds_alternative<rttt::Unknown>(hnitem))
     return items;
-  else
+  else {
     assert(false);
+  }
 
   for (auto &id : kids) {
     if (!known_hnitems.contains(id)) {
@@ -271,8 +274,9 @@ struct state {
         rttt::request::try_pop_and_retrieve(pathRequestCache);
     while (maybe_receive_path.has_value()) {
       auto &pair = maybe_receive_path.value();
-      if (pair.first == "/hn/updates")
+      if (pair.first == "/hn/updates") {
         assert(false);
+      }
       ItemIds ids = nlohmann::json::parse(pair.second);
       idsMap[pair.first] = std::move(ids);
       updated = true;
@@ -357,13 +361,30 @@ struct state {
 private:
 };
 
-
 std::set<std::string> get_known_paths(const hackernews::state &state) {
   std::set<std::string> set;
-  for (auto & pair : state.uriMap)
+  for (auto &pair : state.uriMap)
     set.insert(pair.first);
   return set;
 }
 
+template <typename T>
+auto get_stories(T &&items, const std::vector<int> &item_ids) {
+  auto v = item_ids | std::views::filter([&items](int id) {
+             if (items.contains(id) &&
+                 !std::holds_alternative<hackernews::Story>(items.at(id)))
+               return false;
+             return true;
+           }) |
+           std::views::transform([&items](int id) {
+             if (!items.contains(id)) {
+               items.insert({id, hackernews::Story()});
+             } else {
+               items.mark_active(id);
+             }
+             return std::pair(0, items.at(id));
+           });
+  return std::pair(std::move(items), v);
+}
 } // namespace hackernews
 } // namespace rttt
diff --git a/src/item.hpp b/src/item.hpp
index ee02528..ae54ed2 100644
--- a/src/item.hpp
+++ b/src/item.hpp
@@ -1,5 +1,7 @@
 #pragma once
 
+#include <cassert>
+#include <optional>
 #include <string>
 #include <variant>
 #include <vector>
@@ -85,4 +87,107 @@ struct Comment {
 
 } // namespace reddit
 
+inline std::optional<std::string> try_id_string(const ItemVariant &item) {
+  if (std::holds_alternative<reddit::Story>(item)) {
+    auto value = std::get<reddit::Story>(item);
+    if (value.id.empty())
+      return std::nullopt;
+    return value.id;
+  }
+  if (std::holds_alternative<reddit::Comment>(item)) {
+    auto value = std::get<reddit::Comment>(item);
+    if (value.id.empty())
+      return std::nullopt;
+    return value.id;
+  }
+  if (std::holds_alternative<hackernews::Story>(item)) {
+    auto value = std::get<hackernews::Story>(item);
+    if (value.id == 0)
+      return std::nullopt;
+    return std::to_string(value.id);
+  }
+  if (std::holds_alternative<hackernews::Comment>(item)) {
+    auto value = std::get<hackernews::Comment>(item);
+    if (value.id == 0)
+      return std::nullopt;
+    return std::to_string(value.id);
+  }
+  if (std::holds_alternative<rss::Story>(item)) {
+    auto value = std::get<rss::Story>(item);
+    if (value.id == 0)
+      return std::nullopt;
+    return std::to_string(value.id);
+  }
+  return std::nullopt;
+}
+
+inline std::optional<uint64_t> try_time(const ItemVariant &item) {
+  std::optional<uint64_t> opt;
+  std::visit(
+      [&opt](const auto &data) {
+        if constexpr (!std::is_same<const rttt::Unknown &,
+                                    decltype(data)>::value) {
+          opt = data.time;
+        }
+      },
+      item);
+  return opt;
+}
+
+template <typename C, typename = void> struct has_url : std::false_type {};
+
+template <typename C>
+struct has_url<
+    C, typename std::enable_if<std::is_same<decltype(std::declval<C>().url),
+                                            std::string>::value>::type>
+    : std::true_type {};
+
+inline std::optional<std::string> try_url(const ItemVariant &item) {
+  std::optional<std::string> opt;
+  std::visit(
+      [&opt](const auto &data) {
+        if constexpr (has_url<decltype(data)>::value)
+          opt = data.url;
+      },
+      item);
+  return opt;
+}
+
+inline std::optional<std::string> try_text(const ItemVariant &item) {
+  std::optional<std::string> opt;
+  std::visit(
+      [&opt](const auto &data) {
+        if constexpr (!std::is_same<const rttt::Unknown &,
+                                    decltype(data)>::value) {
+          opt = data.text;
+        }
+      },
+      item);
+  return opt;
+}
+
+inline std::optional<std::vector<ItemVariant>>
+try_kids(const ItemVariant &item) {
+  if (std::holds_alternative<reddit::Story>(item)) {
+    return std::get<reddit::Story>(item).kids;
+  }
+  if (std::holds_alternative<reddit::Comment>(item)) {
+    return std::get<reddit::Comment>(item).kids;
+  }
+  if (std::holds_alternative<hackernews::Comment>(item) ||
+      std::holds_alternative<hackernews::Story>(item)) {
+    assert(false); // kids work different here, so not supported (yet?)
+  }
+
+  return std::nullopt;
+}
+
+template <typename C, typename = void> struct has_text : std::false_type {};
+
+template <typename C>
+struct has_text<
+    C, typename std::enable_if<std::is_same<decltype(std::declval<C>().text),
+                                            std::string>::value>::type>
+    : std::true_type {};
+
 } // namespace rttt
diff --git a/src/main.cpp b/src/main.cpp
index e68226c..45fa943 100644
--- a/src/main.cpp
+++ b/src/main.cpp
@@ -88,48 +88,43 @@ ui::WindowData switch_path(ui::WindowData &&current_window,
   return std::move(current_window);
 }
 
-// FIXME: begin is not relevant anymore now we use scrolling
-std::vector<rttt::ItemVariant> getItems(const rttt::Path &path, size_t begin,
-                                        size_t end) {
-  std::vector<rttt::ItemVariant> result;
-
-  if (path.type == rttt::SiteType::HN) {
-    auto &items = state_hn.items;
-    // FIXME: check whether path exists
-    // otherwise load default or start loading new one?
-    end = std::min(begin + state_hn.idsMap[path.name].size(), end);
-    for (size_t i = begin; i < end; ++i) {
-      const auto &id = state_hn.idsMap[path.name][i];
-      if (!items.contains(id)) {
-        items.insert({id, hackernews::Story()});
-      }
-
-      if (!items.contains(id) ||
-          std::holds_alternative<hackernews::Story>(items.at(id)) == false) {
-        if (state_ui.logSkipOnce) {
-          logger::push_back("HN: Skipping unsupported story " +
-                            std::to_string(id));
-          state_ui.logSkipOnce = false;
-        }
-        continue;
-      }
-
-      items.mark_active(id);
-      auto item = std::get<hackernews::Story>(items.at(id));
-      result.push_back(item);
+auto dispatch_drawing_stories(ui::WindowData &&window) {
+  if (window.path.type == rttt::SiteType::HN) {
+    auto result = hackernews::get_stories(std::move(state_hn.items),
+                                          state_hn.idsMap[window.path.name]);
+    state_hn.items = std::move(std::get<0>(result));
+    // In case highlighted_item is not cached anymore
+    auto maybe_id = try_id_string(window.scroll_state.highlighted_item);
+    if (maybe_id.has_value()) {
+      auto id = std::stoi(maybe_id.value());
+      if (!state_hn.items.contains(id))
+        window.scroll_state.highlighted_item = ItemVariant();
     }
-  } else if (path.type == rttt::SiteType::Reddit) {
-    if (reddit::items.contains(path.name)) {
-      const auto &items = reddit::items.at(path.name);
-      reddit::items.mark_active(path.name);
-      end = std::min(end, items.size());
-      for (auto i = begin; i < end; ++i)
-        result.push_back(items[i]);
+    window.scroll_state =
+        rttt::ui::drawItems(std::move(window.scroll_state), std::get<1>(result),
+                            window.path.mode, state_ui.item_view_states);
+
+  } else if (window.path.type == rttt::SiteType::Reddit) {
+    if (reddit::items.contains(window.path.name)) {
+      auto result = reddit::get_stories(std::move(reddit::items), window.path);
+      reddit::items = std::move(std::get<0>(result));
+      window.scroll_state = rttt::ui::drawItems(
+          std::move(window.scroll_state), std::get<1>(result), window.path.mode,
+          state_ui.item_view_states);
     }
-  } else if (path.type == rttt::SiteType::RSS) {
-    return rttt::rss::get_items(state_rss, path, end);
+  } else if (window.path.type == rttt::SiteType::RSS) {
+    // Technically this temporary is not needed, but compilation (type
+    // inference?) will fail without it
+    auto vec = rttt::rss::get_items(state_rss, window.path);
+    auto v = vec | std::views::transform(
+                       [](const auto &item) { return std::pair(0, item); });
+    window.scroll_state =
+        rttt::ui::drawItems(std::move(window.scroll_state), v, window.path.mode,
+                            state_ui.item_view_states);
+  } else {
+    assert(false);
   }
-  return result;
+  return window;
 }
 
 std::vector<std::pair<size_t, rttt::ItemVariant>> getComments(
@@ -258,28 +253,15 @@ change the number of panels/windows
 
     ImGui::Text("%s", "");
     if (window.path.mode != rttt::list_mode::comment) {
-      // TODO: replace this with std::views:transform, when ranges support
-      // matures somewhat
-      std::vector<std::pair<size_t, ItemVariant>> toShow;
-      for (auto &item : getItems(window.path, 0,
-                                 window.scroll_state.scroll_top +
-                                     ImGui::GetWindowSize().y)) {
-        toShow.push_back(std::pair(0, item));
-      }
-
-      // TODO make newWindows part of WindowData (call it scrollState)
-      window.scroll_state =
-          rttt::ui::drawItems(std::move(window.scroll_state), toShow,
-                              window.path.mode, state_ui.item_view_states);
-
+      window = dispatch_drawing_stories(std::move(window));
       if (ImGui::IsWindowFocused()) {
         auto &item = window.scroll_state.highlighted_item;
         if (ImGui::IsKeyReleased('l')) {
           if (window.path.type == SiteType::RSS) {
             if (std::holds_alternative<rss::Story>(item) &&
                 window.path.name == "/rss") {
-              // Need a copy here, std::get seems to invalidate the item in some
-              // way?
+              // Need a copy here, std::get seems to invalidate the item in
+              // some way?
               auto old_item = item;
               auto key = std::get<rss::Story>(item).key;
               auto new_path = window.path.name + "/" + key;
@@ -379,11 +361,12 @@ change the number of panels/windows
       storyUrl = uri;
     }
     auto maybe_text = try_text(window.scroll_state.highlighted_item);
-    if (maybe_text.has_value())
+    if (maybe_text.has_value()) {
       state_ui.showPopup = rttt::ui::drawUrlPopup(state_ui.showPopup, storyUrl,
                                                   maybe_text.value());
-    else
+    } else {
       assert(false);
+    }
 
     ImGui::End();
   }
@@ -465,8 +448,8 @@ change the number of panels/windows
     }
     ImGui::End();
   } else {
-    // These all need to be at the end, so textInput is true (or false) for all
-    // the above
+    // These all need to be at the end, so textInput is true (or false) for
+    // all the above
     if (ImGui::IsKeyPressed('s', false)) {
       state_ui.showStatusWindow = !state_ui.showStatusWindow;
     }
diff --git a/src/reddit.hpp b/src/reddit.hpp
index 2b43dba..ff43ca1 100644
--- a/src/reddit.hpp
+++ b/src/reddit.hpp
@@ -2,6 +2,7 @@
 
 #include <iostream>
 #include <queue>
+#include <ranges>
 #include <set>
 #include <string>
 
@@ -21,7 +22,7 @@ namespace rttt {
 namespace reddit {
 
 struct credentials {
-  cpr::Header header = cpr::Header{{"User-Agent", "rttt/0.8.0"}};
+  cpr::Header header = cpr::Header{{"User-Agent", "rttt/0.8.1"}};
   std::string client_id;
   std::string device_id;
   std::string refresh_token;
@@ -197,7 +198,7 @@ inline std::string getURIFromPath(const rttt::Path &path) {
     basename = "";
   if (path.mode == rttt::list_mode::comment)
     return base_path + basename + "/comments/" + path.id;
-  return base_path + basename + "/hot";
+  return base_path + basename + "/best";
 }
 
 inline void requestURI(const rttt::Path &path) {
@@ -309,7 +310,7 @@ bool update(state &state) {
   };
 
   auto error_handling = [&state](int status_code) {
-    if (status_code == 401 || status_code == 403) {
+    if (status_code == 401) {
       auto config = config::load();
       state = retrieve_access_token(std::move(state), config);
       logger::push_back("Reddit: requesting new access token");
@@ -361,5 +362,14 @@ inline rttt::Path updatePath(rttt::Path &&path) {
   items.mark_active(path.name);
   return std::move(path);
 }
+
+template <typename T> auto get_stories(T &&items, const rttt::Path &path) {
+  typename T::mapped_value vec;
+  items.mark_active(path.name);
+  auto v =  items.at(path.name) | std::views::transform([](const auto &item) {
+           return std::pair(0, item);
+         });
+  return std::pair(std::move(items), v);
+}
 } // namespace reddit
 } // namespace rttt
diff --git a/src/rss.hpp b/src/rss.hpp
index 49f3cd3..f5e7e67 100644
--- a/src/rss.hpp
+++ b/src/rss.hpp
@@ -13,6 +13,13 @@
 namespace rttt {
 namespace rss {
 
+auto compare_stories = [](const rss::Story &a, const rss::Story &b) {
+  if (a.time == b.time)
+    return (a.id > b.id);
+  return (a.time > b.time);
+};
+using story_set = std::set<rss::Story, decltype(compare_stories)>;
+
 inline std::map<std::string, std::string> parse_opml(pugi::xml_node doc) {
   std::map<std::string, std::string> urls;
   for (auto &child : doc.children()) {
@@ -89,19 +96,36 @@ inline std::tm parse_pubdate(std::string time_string) {
   return time;
 }
 
-inline std::vector<rss::Story> parse_atom(const pugi::xml_document &doc,
-                                          const std::string &key) {
+std::string find_best_link(const pugi::xml_node &entry) {
+  std::string url = {};
+  for (auto lnk = entry.child("link"); lnk; lnk = lnk.next_sibling("link")) {
+    if (url.empty()) {
+      if (lnk.attribute("href")) {
+        url = lnk.attribute("href").value();
+      } else {
+        url = lnk.text().get();
+      }
+    } else if (lnk.attribute("rel") &&
+               std::string("alternate") == lnk.attribute("rel").value()) {
+      url = lnk.attribute("href").value();
+    }
+  }
+  return url;
+}
+
+rss::story_set parse_atom(const pugi::xml_document &doc,
+                          const std::string &key) {
   auto feed = doc.child("feed");
   auto by = feed.child("author").child("name").text().get();
   auto blog = feed.child("title").text().get();
   // Parse all the entries, get url title, by, time and text
-  std::vector<rss::Story> stories;
+  rss::story_set stories;
   for (pugi::xml_node entry = feed.child("entry"); entry;
        entry = entry.next_sibling("entry")) {
     rss::Story story;
     story.title = entry.child("title").text().get();
     story.text = rttt::parse_html(entry.child("content").text().get());
-    story.url = entry.child("link").text().get();
+    story.url = find_best_link(entry);
     story.by = by;
     story.domain = blog;
     auto time_tm = parse_time(entry.child("updated").text().get());
@@ -109,13 +133,13 @@ inline std::vector<rss::Story> parse_atom(const pugi::xml_document &doc,
     story.id = std::hash<std::string>{}(key + story.title +
                                         std::to_string(story.time));
     story.key = key;
-    stories.push_back(story);
+    stories.insert(story);
   }
   return stories;
 }
 
-inline std::vector<rss::Story> parse_rss(const pugi::xml_document &doc,
-                                         const std::string &key) {
+rss::story_set parse_rss(const pugi::xml_document &doc,
+                         const std::string &key) {
   auto channel = doc.child("rss").child("channel");
   auto feed = doc.child("rss").child("channel");
   if (!channel) {
@@ -124,13 +148,13 @@ inline std::vector<rss::Story> parse_rss(const pugi::xml_document &doc,
   }
   auto blog = channel.child("title").text().get();
   // Parse all the entries, get url title, by, time and text
-  std::vector<rss::Story> stories;
+  rss::story_set stories;
   for (pugi::xml_node entry = feed.child("item"); entry;
        entry = entry.next_sibling("item")) {
     rss::Story story;
     story.title = entry.child("title").text().get();
     story.text = rttt::parse_html(entry.child("description").text().get());
-    story.url = entry.child("link").text().get();
+    story.url = find_best_link(entry);
     story.by = entry.child("dc:creator").text().get();
     story.domain = blog;
     if (entry.child("pubDate")) {
@@ -145,7 +169,7 @@ inline std::vector<rss::Story> parse_rss(const pugi::xml_document &doc,
     story.id = std::hash<std::string>{}(key + story.title +
                                         std::to_string(story.time));
     story.key = key;
-    stories.push_back(story);
+    stories.insert(story);
   }
   return stories;
 }
@@ -154,8 +178,8 @@ inline bool is_rss(const pugi::xml_document &doc) {
   return doc.child("rss") || doc.child("rdf:RDF");
 }
 
-inline std::vector<rss::Story> parse_feed(const std::string &content,
-                                          const std::string &key) {
+inline rss::story_set parse_feed(const std::string &content,
+                                 const std::string &key) {
   pugi::xml_document doc;
   pugi::xml_parse_result result = doc.load_string(content.c_str());
   if (!result)
@@ -169,10 +193,9 @@ inline std::vector<rss::Story> parse_feed(const std::string &content,
 struct state {
   std::map<std::string, std::string> feed_urls;
   request::State<std::string> requests;
-  rttt::active_storage<std::string, std::vector<rss::Story>> items;
+  rttt::active_storage<std::string, rss::story_set> items;
 
-  std::vector<rttt::ItemVariant> item_cache;
-  bool is_sorted = false;
+  rss::story_set item_cache;
 
   state() = default;
 
@@ -183,8 +206,7 @@ struct state {
       logger::push_back("RSS: unable to load opml file");
     feed_urls = parse_opml(doc);
 
-    items = rttt::active_storage<std::string, std::vector<rss::Story>>(15 * 60,
-                                                                       60, 0);
+    items = rttt::active_storage<std::string, rss::story_set>(15 * 60, 60, 0);
     for (auto &pair : feed_urls)
       items.insert({pair.first, {}});
   }
@@ -213,44 +235,12 @@ struct state {
         continue;
       }
 
-      // TODO: We rely on items being in descending order. We could consider
-      // using a container which enforces this
-      if (v.size() > 1 && try_time(v[0]).value() < try_time(v[1]).value()) {
-        logger::push_back("Ordering issue: " + key);
-        std::sort(v.begin(), v.end(), [](auto a, auto b) {
-          assert(rttt::try_time(a));
-          assert(rttt::try_time(b));
-          return rttt::try_time(a).value() > rttt::try_time(b).value();
-        });
-      }
-
-      // Check if we have previous entries. If not then insert into the items
-      // cache. Else get the new entries and insert them at the top of the cache
-      // FIXME: If more than one feed has an update in this time, this could
-      // result in those entries being out of order. Would be nice to sort them
-      // first?
       assert(this->items.contains(key));
-      if (this->items.at(key).empty()) {
-        this->item_cache.insert(this->item_cache.end(), v.begin(), v.end());
-        this->is_sorted = false;
-      } else {
-        auto maybe_time = rttt::try_time(this->items.at(key).front());
-        assert(maybe_time);
-        size_t n_new = 0;
-        auto maybe_time2 = rttt::try_time(v[n_new]);
-        assert(maybe_time2);
-        while (maybe_time2.value() > maybe_time.value()) {
-          ++n_new;
-          assert(n_new < v.size());
-          maybe_time2 = rttt::try_time(v[n_new]);
-          assert(maybe_time2);
-        }
-        if (n_new > 0) {
-          // Note to self. Don't resize v here, because we need to copy/move all
-          // into items below.
-          this->item_cache.insert(this->item_cache.begin(), v.begin(),
-                                  v.begin() + n_new);
-        }
+      for (auto &&item : v) {
+        auto pair = item_cache.insert(item);
+        // All other items seem to already exist
+        if (!pair.second)
+          break;
       }
       this->items.at(key) = std::move(v);
       maybe_receive = request::try_pop_and_retrieve(requests);
@@ -267,34 +257,13 @@ std::set<std::string> get_known_paths(const rss::state &state) {
   return set;
 }
 
-inline std::vector<ItemVariant> get_items(rss::state &state,
-                                          const rttt::Path &path, size_t size) {
+auto get_items(const rss::state &state, const rttt::Path &path) {
+  // FIXME: use a view here or can we just return sets of stories directly?
   if (path.mode == rttt::list_mode::feed && !empty(path.id)) {
-    std::vector<ItemVariant> v;
     assert(state.items.contains(path.id));
-    auto &u = state.items.at(path.id);
-    v.insert(v.end(), u.begin(), u.end());
-    return v;
-  }
-  if (!state.is_sorted) {
-    std::sort(state.item_cache.begin(), state.item_cache.end(),
-              [](auto a, auto b) {
-                assert(rttt::try_time(a));
-                assert(rttt::try_time(b));
-                return rttt::try_time(a).value() > rttt::try_time(b).value();
-              });
-    state.is_sorted = true;
+    return state.items.at(path.id);
   }
-  // FIXME clean up this double check that resizing keeps the old size_t
-  // intact
-  auto old_size = state.item_cache.size();
-  auto v = state.item_cache;
-  // Shouldn't we just limit the items printed by the ui in drawItems (since too
-  // many results slow down the UI)
-  if (size < old_size)
-    v.resize(size);
-  assert(state.item_cache.size() == old_size);
-  return v;
+  return state.item_cache;
 }
 
 } // namespace rss
diff --git a/src/rttt.hpp b/src/rttt.hpp
index 40a4b91..9f16914 100644
--- a/src/rttt.hpp
+++ b/src/rttt.hpp
@@ -120,8 +120,9 @@ inline std::vector<std::string> extractURL(std::string text) {
   // ([^/_[:alnum:]]) If this turns out to be false then we could also try to
   // filter out trailing punctuation marks etc specifically ([);:.!?] std::regex
   // e("(https?://[A-z0-9$–_.+!*‘(),./?=]+?)[),;:.!?]*(\\s|$)");
+  // "(https?://[[:alnum:]$-_.+!*‘(),./?=;&#]+?)[^/_[:alnum:]]*(\\s|$)");
   std::regex e(
-      "(https?://[[:alnum:]$-_.+!*‘(),./?=;&]+?)[^/_[:alnum:]]*(\\s|$)");
+      "(https?://[[:alnum:]$-_+!*‘,/?=;&#]+?)(\\]\\(|[^/_[:alnum:]]*(\\s|$))");
   std::smatch sm;
   while (std::regex_search(text, sm, e)) {
     if (sm.size() > 1)
@@ -173,94 +174,6 @@ inline Path parsePath(const std::string &path_name) {
   return path;
 }
 
-inline std::optional<std::string> try_id_string(const ItemVariant &item) {
-  if (std::holds_alternative<reddit::Story>(item)) {
-    return std::get<reddit::Story>(item).id;
-  }
-  if (std::holds_alternative<reddit::Comment>(item)) {
-    return std::get<reddit::Comment>(item).id;
-  }
-  if (std::holds_alternative<hackernews::Story>(item)) {
-    return std::to_string(std::get<hackernews::Story>(item).id);
-  }
-  if (std::holds_alternative<hackernews::Comment>(item)) {
-    return std::to_string(std::get<hackernews::Comment>(item).id);
-  }
-  if (std::holds_alternative<rss::Story>(item)) {
-