This started from Renée DiResta's Lawfare piece — "Fewer Bots, More
Ads: The Pentagon's Evolving Online Influence Campaigns"
— which identifies GDIT as a verified advertiser running a network of
covert-attribution websites (the "gc_" prefix sites) but doesn't
identify the federal contract the work is running under. I wanted to
see what additional context the public procurement record could add.
Roughly 150 API calls across the SWMS IDV drilldown, OTA sweep, subaward
chains, and IT Dashboard sweeps, on tango-python v0.5.0, medium
tier.
What I was trying to find
The Lawfare piece names a prime contractor (GDIT) and a class of
activity (covert ad-buys for a network of "gc_" sites targeting
foreign-language audiences in CENTCOM / SOUTHCOM / AFRICOM / INDOPACOM
AORs) but does not name the specific federal contract paying for the
work. So the question was: given a known prime and a known class of
work but no contract name, can the procurement record narrow down
the vehicle? That turned into six concrete data needs.
For each need I tried USASpending and SAM.gov first, then Tango. The
specific things Tango returned that the others didn't:
1. Find a named IDIQ family that's not in USASpending's keyword index
The trace's original §4 needed to identify what the GBPS → SWMS
"bridge contracts" of 2016-17 were bridging to. SWMS is a real
SOCOM IDIQ family, but it's invisible to USASpending and SAM.gov
keyword search:
USASpending spending_by_award keywords=["SWMS"] → 0 results
Tango list_idvs(search="SWMS") → 25 results, including the
three Group A IDIQs, the sixteen Group B IDIQs, and a Group C
IDV (H9222217D0004, St. Michael's, $150M ceiling).
That's the headline source-coverage gap.
2. Parent-IDV → task-order traversal as one query pattern
USASpending has the IDV detail page (e.g. /award/CONT_IDV_H9222216D0040_9700/) that lists child orders in
a paginated UI, but spending_by_award doesn't accept an idv_piid filter — you can't grab every TO under an IDV via API
in one call.
SAM.gov doesn't have this at all (solicitation index, not awards).
Tango: list_idvs(piid=...) → list_idv_awards(key=..., limit=100, cursor=...). Walked the 19 SWMS IDV PIIDs I had (of 20 in the family) and pulled
123 child task orders in one script, ~40 API calls. (The 20th
was the Group C St. Michael's IDV, which I'd missed because I
didn't have its PIID. Tango's list_idvs(search="SWMS") from
need #1 above would have caught it; lesson learned for me.)
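A minimal sketch of that traversal, written against injected callables rather than the SDK itself, since this issue doesn't show the response objects. The page shape (`results`, `next_cursor`) and the helper names `resolve_idv_key` and `fetch_awards_page` are assumptions for illustration, not tango-python API.

```python
from typing import Any, Callable, Dict, Iterable, Iterator, List, Optional

# Hypothetical page shape: {"results": [...], "next_cursor": "<token>" or None}.
Page = Dict[str, Any]


def iter_cursor_pages(fetch_page: Callable[[Optional[str]], Page]) -> Iterator[Dict[str, Any]]:
    """Yield every row from a cursor-paginated endpoint, one page at a time."""
    cursor: Optional[str] = None
    while True:
        page = fetch_page(cursor)
        yield from page.get("results", [])
        cursor = page.get("next_cursor")
        if not cursor:
            break


def collect_task_orders(
    resolve_idv_key: Callable[[str], Optional[str]],
    fetch_awards_page: Callable[[str, Optional[str]], Page],
    piids: Iterable[str],
) -> Dict[str, List[Dict[str, Any]]]:
    """Map each IDV PIID to the list of its child task orders."""
    orders: Dict[str, List[Dict[str, Any]]] = {}
    for piid in piids:
        key = resolve_idv_key(piid)
        if key is None:
            orders[piid] = []  # PIID did not resolve to an IDV key
            continue
        orders[piid] = list(
            iter_cursor_pages(lambda cursor, k=key: fetch_awards_page(k, cursor))
        )
    return orders
```

In real use, `resolve_idv_key` would wrap `list_idvs(piid=...)` and `fetch_awards_page` would wrap `list_idv_awards(key=..., limit=100, cursor=...)`, each adapting the SDK response to the assumed dict shape.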
3. Subaward chains per prime contract, retrievable by PIID
USASpending has spending_by_subaward, but I didn't do a
systematic per-PIID cross-check between USASpending and Tango on
every GDIT prime in this trace, so I can't precisely characterize
the relative completeness here. Anecdotally, USASpending's
subaward feed is sparse for many older / smaller contracts; both
sources show zero subs on the same GBPS-era contracts.
SAM.gov: no subaward data.
Tango: list_subawards(award_key=...) returned a usable vendor
stack on the two flagship GDIT primes — 61 subaward rows on WSP
(47QFCA19F0035) and 43 on TSS (47QFCA24F0007) — which is what
let me characterize each prime by capability composition.
4. OTA / OTIDV index, with keyword search
USASpending: spending_by_award returns zero matches for W15QKN1791011, a $48M federal OTA whose description explicitly
says "support the US Government's shaping and influence
efforts." It's there in the underlying federal record but not
indexed in the contract-keyword endpoint.
SAM.gov: solicitation-side only, doesn't help for awarded OTAs.
Tango: list_otas(search="influence") surfaces it (along with
~50 other relevant OTAs / OTIDVs).
5. IT Dashboard cross-reference
GSA does have a public "IT Collect" API behind itdashboard.gov
(docs, schema), API-key gated.
I didn't try it directly — for this investigation Tango's list_itdashboard_investments(search=...) made it a one-line
lookup with the same shape syntax as the rest of the SDK.
Confirmed TRWI, AWIP, CCMD, WebOps, PSYOP all return
zero DoD title matches — informative negative.
6. Cross-source comparison
This isn't a Tango feature per se, but a workflow: for each
candidate finding, cross-check the same PIID on USASpending.
Tango's stable response shape made the comparison clean — key, piid, recipient.uei always present and consistent,
so the diff "Tango sees this, USASpending doesn't" was easy to
detect programmatically.
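The diff itself can be sketched as a set comparison keyed on PIID. This assumes both sources have already been normalized to plain dicts with a `piid` key; USASpending's own field names differ, so that normalization step is the caller's job and is not shown.

```python
from typing import Any, Dict, Iterable, List, Mapping


def index_by_piid(records: Iterable[Mapping[str, Any]]) -> Dict[str, Mapping[str, Any]]:
    """Index records by PIID, skipping rows with a missing or empty PIID."""
    return {r["piid"]: r for r in records if r.get("piid")}


def coverage_diff(
    tango_records: Iterable[Mapping[str, Any]],
    usaspending_records: Iterable[Mapping[str, Any]],
) -> Dict[str, List[str]]:
    """Report which PIIDs appear in only one source and which appear in both."""
    tango = set(index_by_piid(tango_records))
    usa = set(index_by_piid(usaspending_records))
    return {
        "tango_only": sorted(tango - usa),
        "usaspending_only": sorted(usa - tango),
        "both": sorted(tango & usa),
    }
```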
Repo layout (anonymized for sharing):
scripts/
  01_swms_drilldown.py   # need #2: list_idvs(piid=) → list_idv_awards(key=)
  02_socom_otas.py       # need #4: list_otas / list_otidvs sweeps
  03_subawards.py        # need #3: list_subawards per award_key
  04_itdashboard.py      # need #5: list_itdashboard_investments keyword sweep
data/                    # raw JSON output, committed for reproducibility
What I'd want preserved in future versions
A few things that worked and would be worth not breaking in future
SDK versions:
list_idvs(piid=...) → list_idv_awards(key=...) traversal — the
cursor pagination is consistent across both endpoints and the
shape syntax carries through.
Nested-field shape syntax (recipient(display_name,uei), awarding_office(*)) on contract / IDV endpoints.
TangoRateLimitError as a separate exception class — made the
retry helper trivial. (TangoNotFoundError and TangoAPIError
are also useful.)
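For reference, a retry helper of the kind described can be as small as the sketch below. It takes the exception class as a parameter so it runs without the SDK installed; the function name, the exponential backoff schedule, and the defaults are illustrative choices, not taken from our scripts.

```python
import time
from typing import Callable, Type, TypeVar

T = TypeVar("T")


def call_with_retry(
    fn: Callable[[], T],
    retry_on: Type[BaseException],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn(), retrying with exponential backoff when retry_on is raised."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the original error
            sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("max_attempts must be at least 1")
```

With the SDK, `retry_on` would be `TangoRateLimitError`, and `fn` a zero-argument lambda wrapping the list call. Other exception classes propagate immediately, which is the point of having a separate rate-limit class.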
Friction we hit (ranked by impact)
1. ContractOrIDVCompetition is referenced as a nested model but isn't in EXPLICIT_SCHEMAS
Expected: request succeeds with the named subfields populated.
Actual:
tango.exceptions.ShapeValidationError: Field 'extent_competed' does not exist in
ContractOrIDVCompetition. No valid fields found in model schema.
Root cause: in tango/shapes/explicit_schemas.py, both CONTRACT_SCHEMA["competition"] and IDV_SCHEMA["competition"] set nested_model="ContractOrIDVCompetition", but EXPLICIT_SCHEMAS only
registers "Competition": COMPETITION_SCHEMA. The ContractOrIDVCompetition dataclass exists in models.py with the
right fields (its docstring even says "alias for Competition"), but
the explicit-schema parser doesn't know about it.
Workaround: use competition(*) — returns the same data without
field-level selection.
Possible fix: either (a) add "ContractOrIDVCompetition": COMPETITION_SCHEMA
to EXPLICIT_SCHEMAS, since the two are described as aliases, or (b)
point the nested_model references at "Competition" to match the
registered key.
2. Server-side shape-validation errors don't name the bad field
When the client-side parser catches a bad field (issue #1), the error
is great — it names the field, names the model, even suggests
alternatives. When the server-side validator catches one (e.g. a
field that doesn't exist on the resource at all), the error is just:
No field name. We had to bisect the shape to find which field was
rejected. Would be great if the server's error matched the client
parser's quality.
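The bisection we did by hand can be sketched as below. `is_accepted` is a hypothetical callable that sends a request with only the given fields in the shape and returns False when the server rejects it. The sketch assumes each field is accepted or rejected independently of the others, and every probe costs one API call.

```python
from typing import Callable, List, Sequence


def find_rejected_fields(
    fields: Sequence[str],
    is_accepted: Callable[[List[str]], bool],
) -> List[str]:
    """Return the fields the validator rejects, by recursively halving the list."""
    candidates = list(fields)
    if not candidates or is_accepted(candidates):
        return []  # nothing to test, or this whole group is fine
    if len(candidates) == 1:
        return candidates  # a single field that is rejected on its own
    mid = len(candidates) // 2
    return (
        find_rejected_fields(candidates[:mid], is_accepted)
        + find_rejected_fields(candidates[mid:], is_accepted)
    )
```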
3. OTA / OTIDV response shapes are much thinner than Contract / IDV
OTA_SCHEMA and OTIDV_SCHEMA expose only key, piid, award_date, description, total_contract_value, obligated, recipient. No awarding_office, no place_of_performance, no psc_code / naics_code, no competition.
Meanwhile list_otas() accepts awarding_agency, funding_agency, psc, naics, pop_*, awarding_office etc. as filter parameters,
so the API clearly has those fields. They're just not in the response
shape.
This made triaging OTA keyword-search results awkward — we wanted to
filter the 50 keyword-matched OTAs to "awarded by H92… (SOCOM)" but
had to fall back to PIID-prefix matching because there was no awarding_office field on the record.
Bringing OTA / OTIDV response shapes up to parity with Contract / IDV
(awarding_office, place_of_performance, psc_code, naics_code,
optional competition) would close this gap.
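The PIID-prefix fallback amounts to a client-side filter like the sketch below. It assumes the records have been converted to plain dicts carrying a `piid` key; the function name is ours.

```python
from typing import Any, Iterable, List, Mapping


def filter_by_piid_prefix(
    records: Iterable[Mapping[str, Any]],
    prefixes: Iterable[str],
) -> List[Mapping[str, Any]]:
    """Keep records whose PIID starts with any of the given prefixes."""
    wanted = tuple(p.upper() for p in prefixes)
    return [
        r for r in records
        if (r.get("piid") or "").upper().startswith(wanted)
    ]
```

For the SOCOM triage described above, the call was effectively `filter_by_piid_prefix(otas, ["H92"])`.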
4. list_subawards shape rejects amount and award_date
Per the comment in tango/models.py:721:
# Note: API does not accept "id" or "amount" in shape (unknown_field).
# Use only accepted fields.
This is a real friction point for any quantitative subaward analysis:
to attach a dollar value or date to each subaward row you have to
follow the parent award_key back to list_contracts(). Two API
calls when one should do.
If amount and award_date are columns on the underlying subaward
table, exposing them on the shape would dramatically reduce
round-trips. (And if they aren't — i.e. subaward dollars aren't
tracked in Tango — that's worth saying in the SDK docstring so users
don't waste time trying.)
5. list_subawards pagination silently caps at ~5000 rows
list_subawards uses page/limit pagination (every other list endpoint
we touched uses cursor pagination) and hits a ceiling around 50 pages
× 100 rows = 5,000 rows per query. On a large prime UEI like Booz
Allen's JCBMLGPE6Z71, that truncates before the full list is
delivered. No indication in the response that there are more rows
you didn't get.
Either:
Switch to keyset (cursor) pagination, as list_contracts / list_idvs / list_otas already use, or
Add a truncation flag / warning header to the response when the cap
is hit, or
Document the cap prominently in the SDK docstring so users know to
scope their queries before pulling.
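Until one of those lands, a client-side guard is cheap. The sketch below warns when a pull reaches the ceiling we observed; the 5,000 figure comes from our runs (50 pages × 100 rows), not from any documented limit, so treat the constant as an assumption.

```python
import warnings
from typing import Sequence

# Observed ceiling: 50 pages x 100 rows. Not a documented limit.
ASSUMED_ROW_CAP = 5000


def check_for_truncation(rows: Sequence[object], cap: int = ASSUMED_ROW_CAP) -> bool:
    """Return True, and emit a warning, if the row count reaches the assumed cap."""
    truncated = len(rows) >= cap
    if truncated:
        warnings.warn(
            f"Pulled {len(rows)} rows, at or above the assumed cap of {cap}; "
            "the result is probably truncated. Narrow the query scope.",
            stacklevel=2,
        )
    return truncated
```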
The current behavior fails silently, which made one of our analyses
inflate "inter-entity link counts" until we noticed and re-scoped (see
"methodology lesson" below).
6. 43 of 50 keyword-search OTAs return recipient: null and award_date: null
For most OTAs surfaced via list_otas(search=...), both recipient
and award_date came back null. Calling get_ota(key) on the same
keys returned the same nulls — so this isn't a list-vs-detail
truncation; it's a data gap in the source.
Notable example: OT_AWD_W15QKN1791011_9700_-NONE-_-NONE- is a $48M
OTA whose description explicitly says "support the US Government's
shaping and influence efforts." No recipient, no award date. Same
PIID returns zero matches on USASpending's spending_by_award
endpoint, so Tango is uniquely surfacing it but can't attribute it.
For comparison, OTAs returned by awarding_agency="USSOCOM" all had
recipient + date populated. So the data-density gap seems specific to
the search-indexed records.
May not be Tango's fault directly, but a note in the OTA docs about
which records are likely to have null attribution fields (and why)
would help users know whether to retry / cross-reference vs accept
the null.
7. awarding_agency parameter accepts forms inconsistently
list_otas(awarding_agency="USSOCOM")  # → 6 results
list_otas(awarding_agency="SOCOM")  # → same 6
list_otas(awarding_agency="U.S. Special Operations Command")  # → 0
list_otas(awarding_agency="SPECIAL OPERATIONS COMMAND")  # → 0
There's no documented canonical form per endpoint. A get_agency_choices() style helper, or even just a documented list of
accepted strings, would help. (CourtListener has a similar get_choices MCP tool, FWIW — works really well for discovering
accepted enum values.)
8. Exposing awarding-office filter would help (related to #3)
The PIID prefix is the best available proxy for awarding office (e.g. H92240 = HQ USSOCOM, H92277 = direct USSOCOM contracting at
MacDill, HR0011 = DARPA), but having to filter by PIID prefix in
post-processing is much slower than a server-side filter. Exposing awarding_office_code as a filter parameter on list_contracts / list_otas / list_subawards would be a real win.
Wishlist
A schema discovery helper. Something like client.get_resource_schema("Contract") that returns the field
list and nested-model field sets. Would save a lot of grepping
through explicit_schemas.py.
list_idv_awards accepting PIID directly. Currently requires list_idvs(piid=...) to resolve PIID → key first. PIIDs are stable
identifiers; this would save the lookup.
list_award_subs(piid=...) convenience method. Resolves PIID
→ contract key → subawards in one call. Doubles as guidance away
from the bulk-by-UEI scope trap (see methodology lesson below).
More examples in the docs of cross-endpoint patterns. "Find
every subaward under a given IDIQ family" is a powerful use case
but you have to assemble it from list_idvs → list_idv_awards
→ per-TO list_subawards. A documented recipe would help.
Methodology lesson (not a bug, but worth documenting)
We initially bulk-pulled all subawards for Booz Allen and GDIT by
prime UEI. On a large integrator UEI, that returned 5,000+ rows of
unrelated work (hitting the page cap from #5) and made cross-entity
link counts look stronger than they actually were: 21 "GDIT → BAH"
rows collapsed to 1 unique contract after de-duping by award_key, and 8 "BAH → Hoplite" rows collapsed to 1 unique
contract on an unrelated GSA TO, not the SOCCENT contract the
inference targeted.
The right pattern was per-PIID list_subawards(award_key=...) on the
specific contracts the investigation cared about. For small specialty
firms, bulk-by-UEI is fine; for large integrators it's a trap.
Two small docs / SDK changes would prevent this for future users:
The list_subawards docstring could mention that for large primes,
per-contract queries (award_key) are usually what you want — not
per-UEI.
The list_award_subs(piid=...) convenience method (wishlist above)
would make per-PIID the default mental model.
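The de-duplication step that exposed the inflation is small enough to sketch here. `award_key` is the field the API returns; the `prime` and `sub` keys are hypothetical labels we attach to each row before counting.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, Mapping, Set, Tuple


def unique_contracts_per_link(
    rows: Iterable[Mapping[str, Any]],
) -> Dict[Tuple[str, str], Set[str]]:
    """Map each (prime, sub) pair to the set of distinct award_key values."""
    links: Dict[Tuple[str, str], Set[str]] = defaultdict(set)
    for row in rows:
        links[(row["prime"], row["sub"])].add(row["award_key"])
    return dict(links)
```

Counting `len()` of each set, rather than raw rows, is what turned 21 "GDIT → BAH" rows into a single contract.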
Thanks for building this. The SWMS-via-Tango find was the only reason
this investigation was tractable.