Proposal: Standardized API for external communication #15894

Closed
8 of 23 tasks
LukasReschke opened this issue Apr 27, 2015 · 33 comments

Comments

@LukasReschke
Member

LukasReschke commented Apr 27, 2015

Disclaimer: I know that this is a controversial topic, but I'd very much appreciate it if, before judging, everyone would take some time to read through my considerations. I have thought about this for a long time, and we have a choice between adding workarounds that are not really 100% backwards compatible and coming up with an alternative long-term approach. This issue should be considered a personal opinion and a set of loose suggestions on how to improve things; nothing in this text has been approved or discussed yet.

tl;dr: Please jump to the "Summary (TL;DR)" section a little bit below.

Let's face it, ownCloud has a huge API zoo that is used at the moment (besides WebDAV et al. which is absolutely fine):

  • OCS
  • Custom AppFramework controller using annotations such as @CORS
  • Completely home-brewed solutions such as REST backends for specific parts of ownCloud

These different APIs are hurting our ecosystem and the overall maintainability: we have different code paths to maintain, we have to document all endpoints, and of course we have to ensure the security of all endpoints, which are often implemented completely differently. We often even have multiple endpoints for the same thing; for example, there is a REST endpoint to add users as well as the public OCS API. They don't share any code at all and are thus different code paths that each need to be verified on their own. This is double the work for us, and we have often experienced critical bugs (security ones and non-security ones) because one of the endpoints received less testing.

Such a widespread API zoo is an indication that developers are not happy with the existing external API functionality, and there are different reasons for it. Let me mention some of the concerns that we are facing when using the current OCS API:

Summary (TL;DR)

  • Security
  • Documentation / Endpoint Discovery
    • The existing APIs that we have are documented suboptimally at best. Nearly no developer likes documenting stuff when the documentation involves setting up sphinx and struggling with the syntax etc. – Even worse: Once an API method got documented and the API changes for whatever reason it is unlikely that the actual documentation gets updated and as there are obviously no unit tests for the documentation this has to be detected manually. And while this is not a technical problem the thing is that we can solve it using technical measures.
    • This also affects the documentation of the API itself, there is no sufficient documentation how to properly create OCS endpoints and unit test them etc.
  • Performance
    • OCS is currently implemented in a way that also allows re-authentication using the session. However, if clients do not properly resend the authentication cookie, a complete relogin is performed and a new session is created, which is bad from both a security and a performance point of view. We have experienced this quite often, and the upcoming 1.8.1 client will in fact address some of these problems with regard to WebDAV (ref https://github.com/owncloud/enterprise/issues/445#issuecomment-94430141).
  • Code Quality
    • OCS currently relies on static code and thus unit-testing it is very hard. This specific problem is related to Make OCS routes work together with the AppFramework #12454, however the original issue is about integrating the AppFramework into the current OCS API. This issue is about evaluating changes that can make the API better to use as well as more secure, thus spreading its usage and so helping the ownCloud ecosystem.

To improve the current situation we have thus to address the above concerns and very importantly also provide backwards compatibility for existing legacy code. The proposal in this issue will address these concerns as well as provide complete backwards compatibility by providing these API methods via a new endpoint and keeping the old ones fully functional. With this proposal we gain:

  1. A secure API endpoint following security best practices
  2. Automated API documentation for all endpoints
  3. The possibility to have other websites integrate with our public API (as we can easily plug CORS over it and some alternate authentication mechanism)
  4. Non-static code that can be properly unit-tested

This issue is not about blaming existing APIs for being bad by definition. It is about evaluating alternative approaches that give ownCloud a modern and easy to use standardized API. If we want to succeed in a world where integration with external systems and security is a key component we need to evaluate what we can do to deliver the best result to our users.

Let's go into some of the problems that we face at the moment and how my proposed implementation would address these. In the following I especially compared OCS to my proposal, despite this the concerns apply also to most of the other APIs that are in some way used within ownCloud.

Endpoints

Currently the endpoints of our APIs do not really follow any kind of standard such as namespaces; instead, with OCS there is only a \OCP\API::register function that takes the URL as an argument. So if an application registers an endpoint foo/bar it will be reachable under /ocs/v1.php/foo/bar and might interfere with other applications that register the same endpoint.

My proposal would be to have the following scheme for API calls: /index.php/api/{application}/{definedEndpoint}. This prevents route collisions and makes it clear at first sight which application an endpoint belongs to.
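
The namespacing can be illustrated with a minimal sketch. The helper buildApiUrl() is hypothetical and not part of ownCloud; it only shows how prefixing the application ID keeps two apps that both register foo/bar apart.

```php
<?php
// Hypothetical helper, not an existing ownCloud function.
function buildApiUrl($application, $definedEndpoint) {
    // Normalise slashes so "foo/bar" and "/foo/bar" yield the same URL.
    return '/index.php/api/' . trim($application, '/') . '/' . ltrim($definedEndpoint, '/');
}

// Two apps registering the same endpoint no longer collide.
echo buildApiUrl('files', 'foo/bar'), "\n"; // /index.php/api/files/foo/bar
echo buildApiUrl('news', '/foo/bar'), "\n"; // /index.php/api/news/foo/bar
```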

Examples

Route registration

The route registration actually would be similar to how existing AppFramework applications already register their controllers. In the following code snippet the method updateFileTags of the controller FileAPI would be called.

<?php
return [
      // Regular routes, that's what we have at the moment
      'routes' => […],
      'resources' => [...],

      // New API routes, same 
      'api_routes' => [
          [
              'name' => 'FileAPI#updateFileTags',
              'url' => '/tags/files/{path}',
              'verb' => 'POST',
              'requirements' => array('path' => '.+')
          ],
      ],
      'api_resources' => [...]
 ];

Requirements

  • API route names will be prefixed with api., e.g. api.news.folders.index
  • The request object should feature a method called isApiRequest() that returns true if an API route was called and false if it's a web interface call
  • Deprecation roadmap for @cors annotation and ApiController should be created
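
As a rough illustration of the naming requirement, the following sketch derives a prefixed route name from the app ID and the controller#method notation used in the routes file. apiRouteName() is a hypothetical helper, and the exact naming convention is still an open point in this proposal.

```php
<?php
// Hypothetical helper: builds "api.<app>.<controller>.<method>" from the
// app ID and the "controller#method" notation used in the routes file.
function apiRouteName($appId, $name) {
    if (strpos($name, '#') === false) {
        throw new InvalidArgumentException('Expected "controller#method", got: ' . $name);
    }
    list($controller, $method) = explode('#', $name, 2);
    return implode('.', ['api', $appId, $controller, $method]);
}

echo apiRouteName('news', 'folders#index'), "\n"; // api.news.folders.index
```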

CORS

  • Allowed methods should include HEAD in addition to the currently used PUT, POST, GET, DELETE, PATCH. Methods are not automatically generated but follow this default to minimize hard to debug preflighted CORS request caching problems
  • Lowering the CORS caching max age from 1728000 to 3600 seconds. This should be short enough for caching. The worst case scenario here is that an author adds another header and their Firefox app is not able to access the API for one hour if they do not know what's happening.
  • Custom headers should remain "Authorization", "Content-Type", "Accept"
  • Developers usually want to reuse their internal controllers for their API, so the preflightedCors method should be copied from the ApiController to the Controller baseclass. Additionally the method should be renamed to options to match the HTTP verb and make it obvious that this method needs to be called/handled in case your own API uses the OPTIONS verb. If the request is not an API request, the built in options method should throw a SecurityException to prevent accidental CSRF for the webinterface, in case someone creates an options route and does not overwrite it
  • For each API URL an additional route should be registered for the OPTIONS verb that calls the controller's options method
  • ApiController will be deprecated and preflightedCors will return parent::options()
  • Instead of using string parameters for CORS like here https://github.com/owncloud/core/blob/master/lib/public/appframework/apicontroller.php#L60 there should be a CORS object, much like the ContentSecurityPolicy object, so we can extend CORS in case the spec is changed without having to modify the parameter length. The object contains getters and setters which are named after their respective CORS counterparts, e.g. Access-Control-Allow-Methods would become CORS setAccessControlAllowMethods($string) and string getAccessControlAllowMethods(). Like the ContentSecurityPolicy object, the setters should be chainable by making use of the train wreck pattern (e.g. $object->setX()->setY()->setZ())
    • CORSMiddleware should be merged with SecurityMiddleware. Logic should be moved to separate classes to be able to easily achieve good code coverage (since this is very important). Negative and positive testing should be applied here.
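
A minimal sketch of what such a CORS object could look like. The class name, the extra setters and toHeaders() are assumptions for illustration; only setAccessControlAllowMethods()/getAccessControlAllowMethods() and the default values are taken from the points above.

```php
<?php
// Hypothetical value object; not an existing ownCloud class.
class CORS {
    private $allowMethods = 'PUT, POST, GET, DELETE, PATCH, HEAD';
    private $allowHeaders = 'Authorization, Content-Type, Accept';
    private $maxAge = 3600;

    public function setAccessControlAllowMethods($methods) {
        $this->allowMethods = $methods;
        return $this; // returning $this makes the setter chainable
    }

    public function getAccessControlAllowMethods() {
        return $this->allowMethods;
    }

    public function setAccessControlAllowHeaders($headers) {
        $this->allowHeaders = $headers;
        return $this;
    }

    public function getAccessControlAllowHeaders() {
        return $this->allowHeaders;
    }

    public function setAccessControlMaxAge($seconds) {
        $this->maxAge = (int)$seconds;
        return $this;
    }

    public function getAccessControlMaxAge() {
        return $this->maxAge;
    }

    // Renders the object as header name => value pairs for a preflight response.
    public function toHeaders() {
        return [
            'Access-Control-Allow-Methods' => $this->allowMethods,
            'Access-Control-Allow-Headers' => $this->allowHeaders,
            'Access-Control-Max-Age' => (string)$this->maxAge,
        ];
    }
}

// Chained setters; untouched values keep the proposed defaults.
$cors = (new CORS())
    ->setAccessControlAllowMethods('GET, HEAD')
    ->setAccessControlMaxAge(600);
print_r($cors->toHeaders());
```

New CORS headers could later be added as further getter/setter pairs without changing any method signature, which is the point of using an object instead of positional string parameters.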

Authorization / OAuth 2.0

The current implementation of OCS allows authentication using the Basic Authentication header as well as the session cookie in combination with the OCS-APIREQUEST header set to a value of true. In theory this should be a somewhat sufficient protection against Cross-Site-Request-Forgery as the Same-Origin-Policy prevents websites from setting arbitrary headers on requests to other domains. (that is for example if a script on owncloud.com tries to request mycloud.ch)

The reality is that there are and will always be browsers that tend to have a different implementation than we expect. Of course, one could argue that all browsers have to work "like they should" but the reality is that the web is a highly dynamic platform and standards are not really implemented all the same or the intended way. As a responsible vendor we have to deliver a solution that is always secure and this comes only with a "defense in depth" strategy.

Current security best practice for mitigating CSRF-like problems is to rely neither on Basic Auth nor on sessions. Instead, authentication should happen using a custom Authorization header. A standardized approach here is to implement the OAuth 2 standard.
With OAuth this would be Authorization: Bearer $longRandomToken; the token can be requested via a not yet defined endpoint. There might be multiple tokens, and tokens might expire or get invalidated by other means, so applications should try to request a new token in case of a 403 (by asking for the user password or by using a refresh token). Tokens are stored in the database and associated with a user. Users of the web interface will be able to request a token automatically, which will be applied to requests that use the jQuery AJAX functions. Thus the actual technical challenges are hidden from individual developers.

For simplicity the OAuth token will also be valid for WebDAV; this means increased performance and no handling of cookies anymore. In the beginning an actual implementation might still rely on sessions for WebDAV, but this is still a huge gain as we can improve things step by step in the future.

This approach gives us the following for free:

  • Security: Correctly implemented (!) OAuth has multiple security advantages over our current implementation
  • Extensibility: We can start with a single permission "all" which basically grants access to everything. In the future we can introduce changes that allow restricting the tokens' validity. (see OAuth 2.0 support for ownCloud [$5] #10400 for some brainstorming on this)
  • Revokability: Ability to revoke access for single clients (see Show clients connected to the account #6120)

While we could also implement OAuth2 on top of our existing OCS API, this would technically not solve the security concerns, as the endpoint would still be "not optimally secure". With two endpoints we can phase out the old one over time and migrate to the new one. Also, the OCS code has grown organically over time, and creating new code would probably be more time-efficient.

Examples

API Request

GET /master/index.php/api/settings/maintenance HTTP/1.1
Host: localhost
Authorization: Bearer LongRandomToken

Requirements

  • No PHP session is started at all. This does possibly require some changes in our session handling.
  • The jQuery AJAX methods are modified to send this header automatically to the API endpoint so that ownCloud application developers do not have to deal with the technical background.
  • Compatible with WebDAV
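
To make the session-less flow more concrete, here is a minimal sketch of how a request like the one above could be resolved to a user. The function name and the in-memory token list are assumptions; in the proposal the tokens would be stored in the database.

```php
<?php
// Hypothetical token store: token => user ID. An array keeps the sketch
// self-contained; the proposal stores tokens in the database instead.
$tokens = ['LongRandomToken' => 'alice'];

// Returns the user ID for a valid "Authorization: Bearer ..." header, or
// null if the header is missing, malformed or the token is unknown.
function resolveBearerUser($authorizationHeader, array $tokens) {
    if (!preg_match('/^Bearer\s+(\S+)$/i', trim((string)$authorizationHeader), $m)) {
        return null;
    }
    foreach ($tokens as $token => $userId) {
        // hash_equals() compares in constant time to avoid timing leaks.
        if (hash_equals((string)$token, $m[1])) {
            return $userId;
        }
    }
    return null;
}

var_dump(resolveBearerUser('Bearer LongRandomToken', $tokens)); // string(5) "alice"
var_dump(resolveBearerUser('Basic dXNlcjpwYXNz', $tokens));     // NULL
```

No session is started and no cookie is read anywhere in this path, which is what the first requirement asks for.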

Controllers / Code Quality

Static code is in most cases not the best idea as it usually creates a huge maintenance burden; we all know this. A detailed post by me explaining why we should integrate the AppFramework can be found at #12454 (comment).

Examples

Controller

Basically the controllers would be plain simple AppFramework controllers:

/**
 * Updates the info of the specified file path
 * The passed tags are absolute, which means they will
 * replace the actual tag selection.
 *
 * @NoAdminRequired
 *
 * @param string $path Path to the file to change
 * @param array|string $tags array of tags
 * @return DataResponse
 */
public function updateFileTags($path, $tags = null) {
    return new DataResponse([]);
}

Automatic documentation / Endpoint Discovery

Using the AppFramework we are able to automatically generate lightweight documentation of the API using the PHPDoc annotations. This provides us with:

  • A list of endpoints per application
  • The parameters that an endpoint allows and what it is supposed to do

The current APIs are not all documented and those that are, are often insufficiently or incorrectly documented. The experience has shown that developers tend not really to document APIs manually using external tools such as sphinx.

Examples

List all applications that have APIs registered

A request to /index.php/api/ will return a JSON blob of all applications that provide API endpoints:

{
    "apps": {
        "cloud-federation": "https://example.com/index.php/api/cloud-federation/",
        "settings": "https://example.com/index.php/api/settings"
    }
}

List endpoint description of an application

The endpoint description of applications can be requested by opening the API endpoint /index.php/api/{appName} (this is also the URL returned in above response). The following pseudo-code would result in the described output:

<?php
namespace OCA\Settings\Appinfo;

$application = new Application();
$application->registerRoutes(
        $this,
        [
          'api' => [
             [
                  'name' => 'User#listUsers',
                  'url' => '/users',
                  'verb' => 'GET',          
             ],
             [
                  'name' => 'User#createUser',
                  'url' => '/users',
                  'verb' => 'PUT',          
             ],
             [
                  'name' => 'User#getInfo',
                  'url' => '/user/{username}',
                  'verb' => 'GET',          
             ],
          ],
        ]
);

// Controller methods referenced by the routes above:
/**
 * Returns a list of all users
 *
 * @return DataResponse
 */
public function listUsers() {
        return new DataResponse();
}

/**
 * Creates a user
 * @param string $username Username of the user to create
 * @return DataResponse
 */
public function createUser($username) {
        return new DataResponse();
}

/**
 * Get information about a single user
 * @param string $username Username of the user to request information about
 * @return DataResponse
 */
public function getInfo($username) {
        return new DataResponse();
}

The JSON key does reference the URL; in this case the users would reference the /index.php/api/settings/users endpoint.

{
   "users":{
      "GET":{
         "description":"Returns a list of all users"
      },
      "PUT":{
         "description":"Creates a user",
         "params":{
            "username":"Username of the user to create"
         }
      }
   },
   "user/{username}":{
      "GET":{
         "description":"Get information about a single user",
         "params":{
            "username":"Username of the user to request information about"
         }
      }
   }
}
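
How such a description could be derived from the PHPDoc blocks can be sketched with PHP's Reflection API. describeEndpoint() and the stripped-down UserController below are hypothetical, and a real implementation would also need to handle multi-line @param descriptions and the AppFramework annotations.

```php
<?php
// Hypothetical controller mirroring the example above (return values
// simplified to arrays so the sketch has no framework dependency).
class UserController {
    /**
     * Returns a list of all users
     *
     * @return array
     */
    public function listUsers() {
        return [];
    }

    /**
     * Get information about a single user
     * @param string $username Username of the user to request information about
     * @return array
     */
    public function getInfo($username) {
        return [];
    }
}

// Builds the description of one endpoint from the method's PHPDoc block.
function describeEndpoint($class, $method) {
    $doc = (string)(new ReflectionMethod($class, $method))->getDocComment();
    $description = [];
    $params = [];
    foreach (preg_split('/\R/', $doc) as $line) {
        // Strip the comment decoration ("/**", "*/" or a leading "*").
        $line = trim(preg_replace('#^\s*(/\*\*|\*/|\*)\s?#', '', $line));
        if ($line === '') {
            continue;
        }
        if (preg_match('/^@param\s+\S+\s+\$(\w+)\s*(.*)$/', $line, $m)) {
            $params[$m[1]] = $m[2];
        } elseif ($line[0] !== '@') {
            $description[] = $line; // free text before the tags
        }
    }
    $result = ['description' => implode(' ', $description)];
    if ($params !== []) {
        $result['params'] = $params;
    }
    return $result;
}

echo json_encode([
    'users' => ['GET' => describeEndpoint('UserController', 'listUsers')],
    'user/{username}' => ['GET' => describeEndpoint('UserController', 'getInfo')],
], JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES), "\n";
```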

Output

The output is expected to default to JSON but might differ; binary content may be returned as well (for example for a thumbnail API). We could also indicate this in the documentation since the return type is annotated.


Proposed Action Plan

As I said, this is only a proposal, and thus this action plan is only here to give us the ability to discuss the feasibility / timeline etc.

  • 8.2
    • Implement the API as suggested and split into a new subticket
      • Sharing API
      • Provisioning API
      • Tagging API
    • Create proper documentation of the usage
    • Do new developments only using the "new" API (the new endpoint format + app framework based API)
  • Long-term
    • Adjust further existing OCS endpoints and migrate clients to the new endpoints

Open questions

  • This way we do have a documented API, but we might also want to differentiate between "public" and "private" APIs, where the "private" one is subject to change at any time. Does anybody have suggestions on what we could do to improve here?

@karlitschek @deepdiver @PUCELA @butonic @MorrisJobke @rullzer @Raydiation Please comment and let us discuss solutions for the problems we face here.

(That said I'd even volunteer to get the 8.2 steps done if we can agree that we want to change this.)

@LukasReschke
Member Author

That said, this issue is obviously also here to discuss how we could achieve the same usability / documentation and security gains using the existing OCS endpoint. If somebody has feasible ideas I'd very much welcome the input. Consider it brainstorming.

@BernhardPosselt
Contributor

Wow huge ticket :)

Automated API documentation for all endpoints

VERY VERY IMPORTANT!!! Django REST framework, for instance, generates docs in a form like this: http://restframework.herokuapp.com/ which is just plain awesome.

I like the API keyword, we have to consider resources too. Also a linkToApi function makes it less flexible in my opinion. What about adding an api key like ['name' => 'file_api#update_file_tags', 'api' => true] which will get a route name based on a convention, for instance api.app.file_api.update_file_tags (put the .api where you like :))?

Automatically generated API Documentation is probably the most important feature of this suggestion, as for "migrating" older APIs: We can keep the old ones around for a long time until we can say that all important apps migrated to the new API.

As for public/private APIs: once you've made an API accessible, it's here to stay. You can however version things pretty easily by just sending a http header that has the required version.

In addition you could just add new annotations to describe the stability like here: https://github.com/kijiproject/annotations/blob/master/src/main/java/org/kiji/annotations/ApiStability.java. These would be taken into account when generating the API docs, which could use a color that signals instability like orange or red. @internal maybe? http://phpdoc.org/docs/latest/references/phpdoc/tags/internal.html

@rullzer
Contributor

rullzer commented Apr 27, 2015

Yes!!! 👍
I would like to bring up that we have certain OCS endpoints that require a somewhat special handling. Capabilities etc. However this is a corner case and the proposal seems very solid in general.

I think we can all agree pretty quickly that OAuth2 support is very welcome and will help with a lot of issues/future work. However, like @LukasReschke pointed out this only works out if we change core parts of the current approach.

The automatic documentation seems like a nice idea. However, it of course still does not help with people not updating the docs. But having the docs so easily editable seems like a good incentive to fix it right away.

The ideas proposed here sound like a very good start!
@LukasReschke you can count on my help implementing.

Having said that, a way to fix the current code is of course also fine. However, I do not really see how we can fix a lot of the things without breaking the current stable API, since the current code has been extended too many times with quick hacks (some of which I am guilty of). There is just stuff in there that does not make a whole lot of sense. This would directly allow for fixing some of those issues.

@BernhardPosselt
Contributor

BTW regarding API deprecation: we should rely on hard data, aka how many apps and customers rely on the API, and only remove them once we can say usage is negligible. Add new features to new APIs only to motivate people to migrate :)

@BernhardPosselt
Contributor

As for the docs I'll propose the following:

We are going to add a middleware (or modify the CORSMiddleware into APIMiddleware) that does the following:

As for documenting the API I'm thinking about something very similar to http://restframework.herokuapp.com/

Will post a quick example very soon

@BernhardPosselt
Contributor

The basic idea is that the API doc should make use of a lot of PHPDoc features, link the APIs so you can switch between them quickly, show verbs and possible parameters (also which parameters are in the URL) and provide a form that allows you to query the API

@DeepDiver1975
Member

Generating API docs from PHPDoc - yes
Adding doc generating code to the live code - no - divide and conquer

@BernhardPosselt
Contributor

@DeepDiver1975 have you taken a look at http://restframework.herokuapp.com/ ?

It could be enabled only when debug mode is active, would always be up to date for the currently installed apps/APIs, would work for all apps that have API routes and can be tested live (including syntax highlighting for JSON/XML).

The alternative is basically creating a wiki like https://github.com/owncloud/news/wiki/API-1.2, manually keeping stuff up to date, and writing a complex parser or enhancing a current parser to generate this stuff.

@BernhardPosselt
Contributor

BTW: another, less intrusive way would be to put this into a separate app that adds an additional middleware if enabled.

@DeepDiver1975
Member

@DeepDiver1975 have you taken a look at http://restframework.herokuapp.com/ ?

no need for me to have a look at this - I've used tools like this in the past. To be honest this adds more issues than it actually solves.

Furthermore I'm convinced that api docs should be more of the formal specification the code has to follow - not the other way round. Devs rely on stable APIs as documented and released - dynamic doc generation is the wrong approach.

@BernhardPosselt
Contributor

So basically stick with the way we are already documenting APIs?

@DeepDiver1975
Member

So basically stick with the way we are already documenting APIs?

there is for sure room to improve - e.g. today we write the docs (if at all) after the code was written - which is a bad practice

@BernhardPosselt
Contributor

Indeed. Maybe add a policy that requires public APIs to have a documentation PR before they are merged? Would fix a lot of app API documentation issues, I'm basically the only one documenting this thing and filesystem and hooks are still not documented well enough.

@BernhardPosselt
Contributor

Some things that are unclear from an API design point of view:

  • The name of the route should probably be api.app_id.controller.method, e.g. api.news.folders.index
  • What about API resources? e.g.
    • 'api' => [name => 'folders#index' ..., 'api_resources' => [name => 'folders#index'...
    • 'api' => ['routes' => [name => 'folders#index' ...], 'resources' => ['folders'...]]
    • 'resources' => [['folders' => ['url' => '/folders', 'api' => true]]]
  • Do we want to allow overriding CORS parameters in the routing instead of the controller baseclass? e.g. 'api' => [name => 'folders#index', 'cors' => ['allowed_headers' => 'Authorization, Content-Type, Accept', 'max_age' => 123213123123]] That way we could reuse controllers.

@DeepDiver1975
Member

Some things that are unclear from an API design point of view:

What is the difference between an API route and a non-API route? Nothing technical in the end, just a convention. I'm not sure if we need additional declarations.

@BernhardPosselt
Contributor

@DeepDiver1975 i meant this specific api design proposal :) badly worded, my bad

@BernhardPosselt
Contributor

We could ofc also turn every route into a web interface and api route. That way the api is automatically generated. The question is ofc if there are security implications because every internal route is available externally (which is already the case I guess).

For instance ['name' => 'page#index', 'url' => '/test'] would create the following URLs:

  • /apps/appid/test
  • /api/appid/test

Depending on which api is accessed additional security checks are turned on or off, implementation would be easy, same for migrating to the new api.

@LukasReschke
Member Author

We could ofc also turn every route into a web interface and api route. That way the api is automatically generated. The question is ofc if there are security implications because every internal route is available externally (which is already the case I guess).

I'd really like to avoid this. Sounds like something where things can easily somehow explode for whatever reason. Clear definition of API and Web Interface route sounds more bullet proof to me.

@BernhardPosselt
Contributor

@LukasReschke downside is ofc that all of our internal APIs become external to rephrase your answer

@BernhardPosselt
Contributor

Another thing that comes to my mind: Currently authentication is run before dispatching anything and all authentication methods are possible. As seen here https://github.com/owncloud/core/blob/master/lib/private/appframework/middleware/security/corsmiddleware.php#L79 we need to actively undo all the authentication work to exclude session auth for api routes for instance

@LukasReschke
Member Author

Post has been updated with some thoughts of @Raydiation and me. A diff can be found at https://gist.github.com/LukasReschke/32fe5cc14e5764a7ef34/revisions

@DeepDiver1975
Member

To finally rephrase what I did mention many times:

  • the current OCS API has to continue to operate properly
  • we shall not change the behavior of the existing v1 OCS API
  • we need to be flexible to allow a v2 of OCS API to be implemented using whatever mechanism we are going to introduce
  • before changing the API we require integration tests (using pyocclient ??) to have a reference to ensure we do not break anything.

@BernhardPosselt
Contributor

OK, after working on automatic OPTIONS route generation for api routes I'm starting to think that this is not possible. We should just keep the current behavior and let devs route preflighted cors requests to the preferred controller.

The reason for this is that you can not automatically determine which controller should be used for the options request. Consider the following route example:

['files#test', 'url' => '/test', 'verb' => 'GET'],
['folders#test', 'url' => '/test', 'verb' => 'POST']

On which controller should the OPTIONS request be looked up? FilesController? FoldersController?
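
The ambiguity can be detected mechanically. The following sketch groups routes by URL and reports every URL that is served by more than one controller; ambiguousOptionsUrls() is a hypothetical helper, the routes use an explicit 'name' key, and the third route was added only for contrast.

```php
<?php
// The two routes from the example above plus one unambiguous route.
$routes = [
    ['name' => 'files#test', 'url' => '/test', 'verb' => 'GET'],
    ['name' => 'folders#test', 'url' => '/test', 'verb' => 'POST'],
    ['name' => 'files#list', 'url' => '/files', 'verb' => 'GET'],
];

// Hypothetical helper: returns the URLs for which an automatically
// generated OPTIONS route could not pick a single controller.
function ambiguousOptionsUrls(array $routes) {
    $controllersByUrl = [];
    foreach ($routes as $route) {
        $controller = explode('#', $route['name'])[0];
        $controllersByUrl[$route['url']][$controller] = true;
    }
    $ambiguous = [];
    foreach ($controllersByUrl as $url => $controllers) {
        if (count($controllers) > 1) {
            $ambiguous[$url] = array_keys($controllers);
        }
    }
    return $ambiguous;
}

print_r(ambiguousOptionsUrls($routes));
```

For /test both the files and the folders controller are candidates, so there is no obvious default for the generated OPTIONS route.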

@MorrisJobke
Contributor

cc @felixboehm because we talked about this yesterday ;) Have a look

@elimisteve

Please implement this; I'm looking to create a Go client library for ownCloud and it is very unclear. Would you recommend looking at https://github.com/owncloud/pyocclient to figure out the relevant API URLs? Thanks!

@MorrisJobke
Contributor

Please implement this; I'm looking to create a Go client library for ownCloud and it is very unclear. Would you recommend looking at https://github.com/owncloud/pyocclient to figure out the relevant API URLs? Thanks!

@elimisteve Use WebDAV for general file access and the OCS API for everything else like sharing or user creation. In ownCloud 9.0 we added a more general WebDAV endpoint that also allows managing CalDAV, CardDAV and comments or tags.

@elimisteve

Thanks @MorrisJobke, that should simplify things dramatically. Looks like this is the latest link to the Share API?

@MorrisJobke
Contributor

Thanks @MorrisJobke, that should simplify things dramatically. Looks like this is the latest link to the Share API?

@elimisteve Correct - we introduced a v2.php for the OCS API in 8.2 which is better to use, but this needs to be explained by @rullzer. (I guess there is some documentation for this missing :()

Maybe come around to IRC #owncloud-dev on freenode so as not to hijack this issue here.

@nickvergessen
Contributor

ocs/v2.php has only one difference from ocs/v1.php: it sends the API's status code as the HTTP status code when it is between 2xx and 5xx. API 100 is converted to HTTP 200, and everything else gets HTTP 400.

In v1 the HTTP status code was always 200.

PS: that means that if the API was designed for v1 (like sharing) you should use v1, because otherwise you get weird results.
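
The mapping described in this comment can be written down as a small sketch. The function names are hypothetical and this is not the actual ownCloud implementation, only a restatement of the rules above.

```php
<?php
// v2: API status codes in the 2xx-5xx range are passed through as the
// HTTP status code, API 100 becomes HTTP 200, everything else HTTP 400.
function ocsV2HttpStatus($ocsStatusCode) {
    if ($ocsStatusCode === 100) {
        return 200;
    }
    if ($ocsStatusCode >= 200 && $ocsStatusCode <= 599) {
        return $ocsStatusCode;
    }
    return 400;
}

// v1: the HTTP status code is always 200, whatever the API reports.
function ocsV1HttpStatus($ocsStatusCode) {
    return 200;
}

foreach ([100, 404, 997] as $code) {
    echo $code, ' => v1: ', ocsV1HttpStatus($code), ', v2: ', ocsV2HttpStatus($code), "\n";
}
```

An API designed for v1 may use codes outside the 2xx-5xx range for non-error conditions, which then surface as HTTP 400 under v2; that is the "weird results" case mentioned above.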

@rullzer
Contributor

rullzer commented Feb 18, 2016

So there is some confusion about the API numbers... which is understandable (since it is not that clean)..

The OCS endpoint has two versions: v1.php and v2.php. And as @nickvergessen points out, the only difference is the status code handling.

However the OCS Share API has only 1 version so far. But you can use the OCS Share API v1 with both versions of the OCS endpoint.

@PVince81 PVince81 added this to the backlog milestone Mar 7, 2016
@ownclouders
Contributor

Hey, this issue has been closed because the label status/STALE is set and there were no updates for 7 days. Feel free to reopen this issue if you deem it appropriate.

(This is an automated comment from GitMate.io.)

@PVince81
Contributor

Some still relevant points about our APIs, some still needing addressing.

@stale

stale bot commented Sep 20, 2021

This issue has been automatically closed.
