ohvbd 1.0.0
First CRAN release
Major API change
- extract_ functions are now glean_.
  - This means that if tidyverse is loaded after ohvbd, there are no direct namespace collisions.
Full list of function name changes:
- extract() -> glean()
- extract_ad() -> glean_ad()
- extract_gbif() -> glean_gbif()
- extract_vd() -> glean_vd()
- extract_vt() -> glean_vt()
- fetch_extract_vd_chunked() -> fetch_glean_vd_chunked()
- fetch_extract_vt_chunked() -> fetch_glean_vt_chunked()
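For scripts written against earlier development versions, the renames above are mechanical. The following is a minimal base R sketch, assuming the old names only appear as unqualified calls. migrate_to_glean() is a hypothetical helper for illustration and is not part of ohvbd.

```r
# Hypothetical helper (not part of ohvbd): rewrite pre-1.0.0 function names
# in a character vector of R code.
migrate_to_glean <- function(code) {
  # extract(), extract_ad(), extract_gbif(), extract_vd(), extract_vt().
  # The negative lookbehind skips longer identifiers and namespace-qualified
  # calls such as tidyr::extract(), which must not be renamed.
  code <- gsub(
    "(?<![A-Za-z0-9_.:])extract(_ad|_gbif|_vd|_vt|)\\(",
    "glean\\1(",
    code,
    perl = TRUE
  )
  # fetch_extract_vd_chunked() and fetch_extract_vt_chunked().
  gsub(
    "(?<![A-Za-z0-9_.:])fetch_extract_(vd|vt)_chunked\\(",
    "fetch_glean_\\1_chunked(",
    code,
    perl = TRUE
  )
}

old <- c(
  "find_vt_ids() |> fetch() |> extract()",
  "res <- fetch_extract_vd_chunked(ids)",
  "tidyr::extract(df, col, into = 'a')"
)
new <- migrate_to_glean(old)
```

Because the pattern skips anything preceded by a colon, a namespace-qualified ohvbd::extract() call is also left untouched and would need editing by hand.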
New functions & arguments:
- ohvbd now interfaces with GBIF for occurrence data.
  - New *_gbif functions (e.g. fetch_gbif()) allow for retrieving and extracting data from GBIF.
  - A GBIF account and the rgbif package are required to retrieve data from GBIF.
  - The account details must also be set up as shown in the rgbif documentation.
- New tee() command allows one to extract data from the middle of a pipeline and save it to an environment.
  - This is not only useful for ohvbd workflows: it can be used in any base R pipeline (|>). It has not been tested in magrittr pipelines but should work as-is.
- New filter_db() command allows for filtering hub search results down to those of a single database.
- check_db_status() now returns (invisibly) whether all databases are up or not.
- New fetch_citation() and fetch_citation_* commands provide an interface to attempt to retrieve citations from a vectorbyte dataset.
  - By default, this may redownload parts or all of the data if the required columns are not currently present.
- New force_db() function enables one to force ohvbd to consider a particular object as having a particular provenance.
- New simplify argument to search_hub() makes hub searches return an ohvbd.ids object if only one database was searched for. This behaviour is on by default.
  - To match this, filter_db() will now transparently return ohvbd.ids objects if it gets them.
- New taxonomy argument to search_hub() allows for filtering searches by GBIF backbone IDs.
- New match_species() function allows for quick and flexible matching of species names to their GBIF backbone IDs.
- New match_country() function allows for matching of country names to WKT polygons via naturalearth.
- New ohvbd_db(), has_db(), and is_from() functions allow for quick testing of object provenance (according to ohvbd).
- New get_default_ohvbd_cache() function allows for custom functions that interface with cached ohvbd data files.
- New list_ohvbd_cache() and clean_ohvbd_cache() functions enable better interactive cache management.
  - As a result, clean_ad_cache() has been removed as it is now unnecessary.
- search_x_smart() functions can now take "tags" as a search field, enabling support for tagged datasets.
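To illustrate the idea behind tee(), here is a minimal base R sketch of a pass-through step that stores an intermediate value in an environment and then hands the value on unchanged. tee_sketch() and its arguments are hypothetical: they do not reflect the actual signature or implementation of ohvbd's tee(), for which the package documentation is the reference.

```r
# Hypothetical stand-in for the concept; NOT ohvbd's tee().
tee_sketch <- function(x, name, envir) {
  assign(name, x, envir = envir) # save a copy of the intermediate value
  x                              # pass the value on to the next pipeline step
}

store <- new.env()
total <- c(1, 4, 9) |>
  sqrt() |>
  tee_sketch("roots", envir = store) |>
  sum()
```

After the pipeline runs, store$roots holds the square roots from the middle of the pipeline, while total holds the final sum.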
Other:
- Entire code base is now continuously formatted using Air v0.7.1.
- Examples are no longer wrapped in \dontrun{} so they should be runnable from an installed version of the package.
- A good chunk of the functional logic of ohvbd is now covered with unit tests (using the vcr package).
- fetch_vd() no longer tries to retrieve ids with no pages of data.
- Functions that interface with vectorbyte databases no longer recommend using set_ohvbd_compat() as unexpected SSL errors should break pipelines by default.
  - These errors are no longer expected to occur when interfacing with vectorbyte.
- Running fetch() on an ohvbd.hub.search or glean() on an ohvbd.ids object now provides a hint that you may have forgotten something.
  - Occasionally users would forget a fetch() command and run search_hub() |> glean(), which didn’t previously give an interpretable error.
- Vignettes now use vcr to massively reduce their build time. This should only matter to developers of ohvbd, or users who download from GitHub and build the vignettes themselves.
- ohvbd.ids() now warns you and fixes the problem if you provide ids with duplicate values.
- glean_vt() and glean_vd() now force the inclusion of the dataset ID when filtering columns (using the cols argument).
  - This is intended to encourage you to preserve at least one means of retrieving citation data later.
- WKT parsing and formatting is now significantly more robust.
- Cached AREAdata now includes the cache timestamp as an attribute rather than a separate variable in the cache file.
- glean_ad() now correctly returns a matrix even when there is only 1 row or column.
- gadm spatial files are now cached as GeoPackage rather than shapefiles, leading to a >50% speedup in loading! (Thanks to @josiah.rs on bluesky for the suggestion!)
- fetch_vd_counts() is now significantly faster, more robust, and temporarily caches data.
  - You will see particular improvements if you are trying to retrieve more than about 10 ids in one go or if you are repeatedly running the same download code in the same day.
  - This speedup also applies to fetch_vd() under the hood, particularly if you are running it multiple times in a day.
- Explicit term checking (such as in fetch_ad() for metrics and search_vt_smart() for operators and fields) is now fuzzy, allowing for a small amount of deviation from the actual term name.
- assoc_ad() now tries to guess LatLong column names if none (or the wrong ones) are provided.
- Errors in internal functions now make it clearer which user-facing functions they originate from.
- Multiple functions now default to NULL rather than NA for default missing values (except date arguments to AD-related functions, where NA is more reasonable in the grand scheme).
- fetch_ad() now caches and tries to read from cache by default.
  - Generally speaking, unless exceedingly up-to-date data is required, this will be the best option for most people.
  - If you do require guaranteed new data, it’s worth setting refresh_cache = TRUE or use_cache = FALSE (depending on whether you want to replace your existing cache or not).
- All downloaders that can potentially cache data also attach the download time if not loading from cache.
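The caching notes above can be pictured with a small base R sketch. It assumes one reading of the two fetch_ad() arguments: refresh_cache = TRUE downloads again and overwrites the cache, while use_cache = FALSE bypasses the cache without touching it. cached_fetch_sketch(), its downloader argument, and the attribute names are hypothetical and are not ohvbd's implementation.

```r
# Hypothetical helper illustrating the caching behaviour; NOT ohvbd code.
# `downloader` stands in for any function that returns fresh data.
cached_fetch_sketch <- function(key, downloader, use_cache = TRUE,
                                refresh_cache = FALSE, cache_dir = tempdir()) {
  path <- file.path(cache_dir, paste0(key, ".rds"))

  # Read from the cache only when allowed and no refresh was requested.
  if (use_cache && !refresh_cache && file.exists(path)) {
    out <- readRDS(path)
    attr(out, "from_cache") <- TRUE
    return(out)
  }

  out <- downloader()
  attr(out, "download_time") <- Sys.time() # timestamp travels with the data

  # Write (or overwrite) the cache unless caching is switched off entirely.
  if (use_cache) {
    saveRDS(out, path)
  }
  out
}

cache_dir <- tempfile("ohvbd-sketch-")
dir.create(cache_dir)
n_downloads <- 0
fake_download <- function() {
  n_downloads <<- n_downloads + 1 # count how often a "download" happens
  c(temperature = 21.5)
}

first <- cached_fetch_sketch("demo", fake_download, cache_dir = cache_dir)
second <- cached_fetch_sketch("demo", fake_download, cache_dir = cache_dir)
third <- cached_fetch_sketch("demo", fake_download,
                             refresh_cache = TRUE, cache_dir = cache_dir)
fourth <- cached_fetch_sketch("demo", fake_download,
                              use_cache = FALSE, cache_dir = cache_dir)
```

Only the second call is served from the cache. The first, third, and fourth calls each trigger a download, and only the fourth leaves the cache file as it was.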
ohvbd 0.6.1
- New search_hub() function enables searching across multiple databases at once via vbdhub.
  - This includes new functionality for specifying searches using spatial extent polygons and generally more intelligent search behaviour.
- New function generate_vt_template() quickly generates a VecTraits template for later upload.
ohvbd 0.6.0
- Internally ohvbd now only uses base R pipes (|>).
- The magrittr pipe (%>%) is no longer used internally, nor is it exported for use.
- httr2 v1.1.1 deprecated the pool argument of req_perform_parallel() which broke fetch() commands across ohvbd.
  - These have now been rewritten using the new max_active argument, which does simplify everything a bit.
  - This change does bump the required version of httr2 to be v1.1.1.
ohvbd 0.5.2
- fetch_ad() now searches for and retrieves the most up-to-date GID2 files from AREAdata.
- New timeout parameter of fetch_ad() to control timeouts of AD downloads. Defaults to 4 minutes.
- assoc_ad() now correctly extracts data (this functionality regressed in 0.5.0 as a consequence of the new dynamic method dispatch approach to data retrieval).
- assoc_ad() also now gives consistent output even when a 1-dimensional output is returned from extract_ad().
- All fetch_ functions now have a default connections argument of 2, leading to faster retrieval across the board.
- check_src argument has been removed from all functions. It no longer serves much of a purpose due to the sanity checking changes implemented in 0.5.0.
ohvbd 0.5.1
- fetch_vd() now correctly returns all data from datasets over 50 rows.
- fetch_vd() also now tells you how much data you are retrieving and a coarse estimate of how long this will take.
- New function fetch_vd_counts() allows for quick checking of dataset sizes. This is very important as some datasets in VecDyn are over 40,000 rows long!
- All fetch_ functions (and thus also fetch()) now use parallel data retrieval, even when only 1 connection is used. This seems to lead to a 20% gain in download speed for no cost.
ohvbd 0.5.0
Major API change
- get_ functions have been split into two new types of function, based upon exact usage.
  - find_ functions retrieve metadata such as column definitions and ids.
  - fetch_ functions retrieve actual datasets.
- New set of S3 classes (ohvbd.ids, ohvbd.responses, ohvbd.data.frame, ohvbd.ad.matrix) to allow for nicer checks of data integrity.
  - This has the side effect of no longer falsely triggering the data continuity checks of fetch_ functions when indexing the output of find_x_ids() functions.
- New convenience functions fetch() and extract() leverage dynamic method dispatch along with the above classes to infer the correct underlying fetch_ and extract_ functions to use.
  - As such you can now write code such as find_vt_ids() |> fetch() |> extract() without having to remember the correct extractor to use.
  - You can still use the specific extractor functions as before should you desire.
- All major functions interfacing with AD, VD, and VT output one of these classes.
- Cached data from AD now contains an attribute to signify that it is cached.
- New classes are subclassed from other base R classes, and so mostly behave in the same way (i.e. you can subset an ohvbd.data.frame in the same way as just subsetting a normal df).
- New function ohvbd.ids() allows users to create objects of the same S3 class as output by the find_ and search_ functions.
- New is_cached() function enables a simple check to see if an object has been loaded from the cache by ohvbd.
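As a rough picture of how class-based dispatch can pick the right extractor, here is a minimal base R sketch using S3. The generic extract_sketch() and the classes vt_responses and vd_responses are hypothetical names chosen for illustration; ohvbd's real classes, generics, and methods differ.

```r
# Hypothetical classes and functions, for illustration only.
extract_sketch <- function(x, ...) UseMethod("extract_sketch")

extract_sketch.vt_responses <- function(x, ...) {
  paste("VecTraits extractor applied to", length(x), "responses")
}

extract_sketch.vd_responses <- function(x, ...) {
  paste("VecDyn extractor applied to", length(x), "responses")
}

# Fallback: give an interpretable error for objects of unknown provenance.
extract_sketch.default <- function(x, ...) {
  stop("Don't know how to extract from an object of class '", class(x)[1], "'.")
}

# Tag plain lists with a class, as a fetch step might tag its output.
vt <- structure(list("resp1", "resp2"), class = c("vt_responses", "list"))
vd <- structure(list("resp1"), class = c("vd_responses", "list"))

vt_result <- extract_sketch(vt) # dispatches to the VecTraits method
vd_result <- extract_sketch(vd) # dispatches to the VecDyn method
```

The caller never names the database-specific function: the class attached upstream decides which method runs, and unclassed input fails with a readable message.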
Full list of function name changes:
- get_ad() -> fetch_ad()
- get_extract_vd_chunked() -> fetch_extract_vd_chunked()
- get_extract_vt_chunked() -> fetch_extract_vt_chunked()
- get_gadm_sfs() -> fetch_gadm_sfs()
- get_vd() -> fetch_vd()
- get_vt() -> fetch_vt()
- get_vd_columns() -> find_vd_columns()
- get_vd_current_ids() -> find_vd_ids()
- get_vt_current_ids() -> find_vt_ids()
ohvbd 0.4.4
- New function check_ohvbd_config() allows easy printing of the current status of ohvbd’s options.
- New clean_ad_cache() function enables users to clean their cached AREAdata files easily.
- Build timings now appear in all vignettes.
- Cli outputs are now suppressed when running vignettes in non-interactive mode (e.g. while knitting).
- Default cache path is now in the user directory (obtained from tools::R_user_dir()).
- use-areadata vignette now has part of its content complete.
- Generally this update is setting the stage for another major API overhaul in 0.5.0.
ohvbd 0.4.3
- Large changes to the error handling of all vb get_ and search_ functions.
- All of these functions now check automatically for SSL issues, and recommend set_ohvbd_compat() if these are detected.
- All get_ calls requesting more than 10 ids run a pre-flight SSL check before attempting the whole thing.
- get_vd() and get_vt() now also return a list of ids that were missing and any curl errors that were found in the process of trying to get data.
ohvbd 0.4.2
- set_ohvbd_compat() now asks for user confirmation in interactive mode. This makes running on Linux a little annoying, but is worth it due to the seriousness of disabling SSL identity verification.
- This is not asked if the R session is running in batch mode, under knitr, or under testthat.
- retrieving-data vignette now only enables compatibility mode if running under Linux. Generally it is best to keep package usage of set_ohvbd_compat() to an absolute minimum.
- Copyright holder now listed in DESCRIPTION.
ohvbd 0.4.1
- New parallel downloading options for get_x() and get_extract_x() functions.
- These are to be used with caution, as they put significantly more load on the server than a sequential run would.
- New argument check_src allows for toggling of id-sanity checking for most functions.
- retrieving-data vignette now contains instructions for the use of search_x_smart().
ohvbd 0.4.0
Major API change
- Major simplification of function names!
- get_x_byid() -> get_x()
- extract_x_data() -> extract_x()
- assoc_x_y() -> assoc_x()
- get_extract_x_byid_chunked() -> get_extract_x_chunked()
- This breaks ALL PREVIOUS CODE!
- Naming now follows a logical scheme of verb_target_modifier().
- For example get_x_y() functions always retrieve data from database x with y specifying any special type of data.
- Similarly extract_x() functions always extract data.
- If a function does multiple things, it may get multiple verbs separated by underscores, e.g. get_extract_x_chunked().
- Pipelines now internally attempt to confirm data integrity by checking that the correct functions are piped together.
- This means it is no longer easy to accidentally do something like get_vd_current_ids() |> get_vt().
ohvbd 0.3.1
- New function format_time_overlap_bar() allows for visually formatting a range of dates combined with another set of target dates to see where overlaps do or do not take place.
- This is mostly used in the error handling of extract_ad(); however, it can also be used independently. It was designed to fill a more general role within UI design using the cli package, and should be usable (or hackable) by others needing the same tool.
- extract_ad() now errors when all targetdate entries are outside of the range of the AREAdata dataset.
- New assoc_ad() associates arbitrary data including lon/lat columns with AREAdata.
- New get_vd_columns() provides quick reference about the currently present VecDyn columns. (This is currently not possible for VecTraits, but the feasibility is being investigated.)
- New assoc_gadm() function associates gadm ids at all spatial scales with arbitrary data that include lon/lat columns.
- Documentation now correctly displays favicons.
- Logo now rotates through a variety of colour schemes according to the version number.
ohvbd 0.3.0
Major API change
- *_basereq() calls are no longer required as the first argument for functions.
- As such, data downloads no longer need to start with vb_basereq() |>.
- Basereq can now be overridden by providing an alternative basereq to the basereq argument of these functions, which can be generated using vb_basereq().
- This is usually only needed if using the argument unsafe = TRUE for vb_basereq().
- It is also possible to set ohvbd to use compatibility-mode SSL calls using set_ohvbd_compat().
- This change breaks any code written prior to this version, and so major rewrites may be required.
ohvbd 0.2.5
- extract_ad() now allows targetdate to be specified as a vector of full dates, e.g. c("2023-08-04", "2023-09-21").
ohvbd 0.2.4
- VecTraits & VecDyn search functions no longer return stale responses if the search fails.
ohvbd 0.2.3
- VecTraits functions now use the cli package to provide a nicer command-line interface.
- VecDyn functions now use the cli package to provide a nicer command-line interface.
ohvbd 0.2.2
- AREAdata functions now use the cli package to provide a nicer command-line interface.
- retrieving-data vignette now builds significantly quicker.
ohvbd 0.2.1
- get_ad() now caches data from AREAdata to reduce extraneous data downloading and speed up re-execution and development.
- This requires the new argument use_cache = TRUE and caches by default in the user directory.
ohvbd 0.2.0
- Install instructions now include the correct command for building vignettes when installing from GitHub.
- New check_db_status() allows for easy checking of the online status of various data providers.
- ohvbd now interfaces with the AREAdata repository for historical climate data.
- This includes functions to get and filter AREAdata datasets at different spatial scales.
ohvbd 0.1.4
- Small framing change in retrieving-data vignette (courtesy of @willpearse).
ohvbd 0.1.3
- New retrieving-data vignette to explain the basic process of downloading and extracting data from VecTraits and VecDyn.
ohvbd 0.1.2
- New search_vd() and search_vd_smart() functions allow for searching of VecDyn in the same manner as for VecTraits.
ohvbd 0.1.1
- get_vb_basereq() renamed to vb_basereq() for ease of writing.
ohvbd 0.1.0
- Chunked retrieval now correctly uses the chunksize argument.
- ohvbd now interfaces with the VecDyn database for vector population dynamic data.
- This includes functions to get ids, get datasets, and to extract data from the responses.
- These functions use the same API structure as for VecTraits data download, but with vd replacing vt in the function names (e.g. get_vd()).
ohvbd 0.0.5
- New search_vt() allows for keyword-based searching of VecTraits.
- get_vt_current_ids() now handles 404 responses gracefully.
ohvbd 0.0.4
- New search_vt_smart() allows for field-based searching of VecTraits.
ohvbd 0.0.3
- get_vt() now leverages httr2::req_perform_sequential() for more efficient dataset retrieval.